Integral Equations and Discretizations
for Waveguide Apertures

John J. Ottusch, George C. Valley, and Stephen Wandzura


Abstract: We present integral equations and their discretizations for calculating the fields radiated from arbitrarily shaped antennas fed by cylindrical waveguides of arbitrary cross section. We give results for scalar fields in two dimensions with Dirichlet and Neumann boundary conditions and for (vector) electric and magnetic fields in three dimensions. The discretized forms of the equations are cast in identical format for all four cases. Feed modes can be TM, TE, or transverse electromagnetic (TEM). A method for numerically computing the modes of an arbitrarily shaped, cylindrical waveguide aperture is also given.

Index  Terms — Aperture  antennas,  integral  equations. 


I.  Introduction 

Numerical simulation of the electromagnetic performance of antennas using integral equations requires a mathematical model of the driving sources. In contrast to scattering cross-section computations, where a distant source creates a plane wave in the vicinity of the scatterer, construction of an accurate source model for an antenna is nontrivial. If a simple approach, such as a "delta-gap" excitation [1], is used, the accuracy of some important antenna parameters, such as input impedance, gain, and reflection, can be seriously compromised, even for cases in which the far-field pattern is obtained accurately.

The purpose of this paper is twofold. First, we develop integral equations representing exact specification of the field emanating from an aperture of arbitrary shape, with the field entering the aperture left unconstrained and to be determined. The exact definition of the emanating field is accomplished by analysis of a translationally invariant waveguide that has the cross section of the given aperture. In the context of a generalized scattering problem such as a waveguide-fed antenna, such an integral equation may serve as a boundary condition that must be obeyed inside the waveguide on any plane normal to its axis. Second, we derive discretized forms of the integral equations¹ (using the method of moments) that are suitable for numerical computation. As part of this development, we give a useful interpretation of the kernel that appears in the "waveguide integral equation."
1 An equivalent formulation of the feed model for the electromagnetic case has been used previously by McGrath and Pyati [2]. We, however, try to clarify the intent, development, and use of this formulation in the context of a generalized method of moments discretization.


Our development is based on the assumption that the waveguide is:

•  translationally invariant in the half-space behind the aperture along the axis normal to the aperture;

•  terminated by a perfect absorber or is so long as to be practically nonreflecting;

•  filled with a linear, isotropic, homogeneous medium;

•  enclosed by walls that are infinitely hard or infinitely soft in the scalar scattering case or perfectly conducting in the electromagnetic scattering case.

The first section is devoted to finding continuous and discretized forms of the waveguide integral equations for scalar waves and then applying them to more general scattering problems. These equations apply to acoustic scattering in two or three dimensions as well as the two-dimensional (2-D) analogues of three-dimensional (3-D) electromagnetic scattering (which apply to scatterers with translational symmetry in a direction orthogonal to the axis of the waveguide). In the second section, we do the same for 3-D electromagnetic scattering. The two treatments are entirely analogous. Formulas for the power flow out of (due to the given excitation) and into (due to back scattering) the waveguide are also given in each section. In the third section, we show how the waveguide integral equations can be extended to more general circumstances. Prescriptions for numerically computing the modes of cylindrical waveguides with arbitrary cross sections may be found in the Appendix.

II.  Scalar  Waveguide  Equations 

A.  Modes 

An arbitrary field $\psi(\mathbf{x})$ that satisfies the scalar Helmholtz equation

$$(\nabla^2 + k^2)\,\psi(\mathbf{x}) = 0 \tag{1}$$

inside a waveguide aligned with the $z$ axis can be written as a sum of modal components² traveling in the $+z$ and $-z$ directions [3]

$$\psi(\mathbf{x}_\perp, z) = \sum_n \left(a_n e^{i\beta_n z} + b_n e^{-i\beta_n z}\right) u_n(\mathbf{x}_\perp). \tag{2}$$

2 For simplicity, we will assume that no cutoff modes (i.e., those with $\beta_n = 0$) are present. It is straightforward to amend the development to handle such modes.
Likewise, the longitudinal derivative of the field may be written as

$$\frac{\partial \psi(\mathbf{x}_\perp, z)}{\partial z} = \sum_n \frac{ik}{Z_n}\left(a_n e^{i\beta_n z} - b_n e^{-i\beta_n z}\right) u_n(\mathbf{x}_\perp) \tag{3}$$

where

$$Z_n = \frac{k}{\beta_n} \tag{4}$$

is the modal impedance. In these equations, an implicit $e^{-i\omega t}$ time dependence is assumed for the fields, $k = \omega/c$ is the free-space propagation constant, and $\beta_n$ and $u_n(\mathbf{x}_\perp)$ are, respectively, the propagation constant and transverse field distribution of the $n$th mode inside the guide. The modes are eigensolutions to the scalar wave equation

$$(\nabla_\perp^2 + k^2 - \beta_n^2)\,u_n(\mathbf{x}_\perp) = 0 \tag{5}$$

for $\mathbf{x}_\perp$ inside the waveguide aperture $W$, and the $u_n(\mathbf{x}_\perp)$ are constrained to satisfy the boundary conditions of the waveguide walls when $\mathbf{x}_\perp$ is on the boundary of the aperture $\partial W$. With proper normalization, the modes form a complete and orthonormal set of functions over $W$, i.e.,

$$\sum_n u_n(\mathbf{x}_\perp)\,u_n(\mathbf{x}'_\perp) = \delta(\mathbf{x}_\perp - \mathbf{x}'_\perp) \qquad \text{Completeness} \tag{6}$$

and

$$\int_W d\mathbf{x}_\perp\, u_m(\mathbf{x}_\perp)\,u_n(\mathbf{x}_\perp) = \delta_{mn} \qquad \text{Orthonormality.} \tag{7}$$
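As a concrete illustration of the modal quantities above, the following minimal sketch evaluates them for the simplest cross section: a 2-D parallel-plate guide with soft (Dirichlet) walls, for which the modes are known in closed form. The geometry, the function names `dirichlet_modes` and `overlap`, and the numerical values are assumptions chosen for illustration; they do not come from the paper. The sketch takes the modal impedance to be $Z_n = k/\beta_n$ and checks orthonormality by quadrature.

```python
import cmath
import math


def dirichlet_modes(width, k, n_modes):
    """Modes of a parallel-plate guide with soft walls at x = 0 and x = width.

    Returns a list of (u_n, beta_n, Z_n), where u_n is a callable.
    """
    modes = []
    for n in range(1, n_modes + 1):
        kc = n * math.pi / width            # transverse wavenumber of mode n
        beta = cmath.sqrt(k * k - kc * kc)  # real: propagating, imaginary: evanescent
        norm = math.sqrt(2.0 / width)       # makes the integral of u_n^2 over W equal 1

        def u(x, kc=kc, norm=norm):
            return norm * math.sin(kc * x)

        modes.append((u, beta, k / beta))   # modal impedance Z_n = k / beta_n
    return modes


def overlap(u, v, width, points=400):
    """Midpoint-rule approximation of the integral of u(x) v(x) over [0, width]."""
    h = width / points
    return h * sum(u((p + 0.5) * h) * v((p + 0.5) * h) for p in range(points))


modes = dirichlet_modes(width=1.0, k=10.0, n_modes=4)
# Gram matrix of the modes; it should be close to the identity.
gram = [[overlap(um, un, 1.0) for (un, _, _) in modes] for (um, _, _) in modes]
```

With these illustrative parameters the first three modes propagate (real $\beta_n$) and the fourth is evanescent (imaginary $\beta_n$), so its impedance is imaginary as well.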


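B. Waveguide Integral Equation

Let $\psi^{\mathrm{out}}(\mathbf{x}_\perp, z)$ denote a specified outgoing wave, $z = 0$ correspond to the plane of the waveguide aperture, and the rest of the waveguide be located in the half-space with $z < 0$. Using the modal expansions and the completeness relation for the modes, we can write the following expression for $\psi^{\mathrm{out}}(\mathbf{x}_\perp, 0)$ in terms of the field and its longitudinal derivative on $W$:

$$\begin{aligned}
\psi^{\mathrm{out}}(\mathbf{x}_\perp, 0) &= \sum_n a_n u_n(\mathbf{x}_\perp) \\
&= \frac{1}{2}\sum_n (a_n + b_n)\,u_n(\mathbf{x}_\perp) + \frac{1}{2}\sum_n \frac{Z_n}{ik}\,(a_n - b_n)\,\frac{ik}{Z_n}\,u_n(\mathbf{x}_\perp) \\
&= \frac{1}{2}\psi(\mathbf{x}_\perp, 0) + \frac{1}{2}\int_W d\mathbf{x}'_\perp\, H(\mathbf{x}_\perp, \mathbf{x}'_\perp)\left.\frac{\partial \psi(\mathbf{x}'_\perp, z')}{\partial z'}\right|_{z'=0}
\end{aligned} \tag{8}$$

where

$$H(\mathbf{x}_\perp, \mathbf{x}'_\perp) = \sum_n \frac{Z_n}{ik}\,u_n(\mathbf{x}_\perp)\,u_n(\mathbf{x}'_\perp). \tag{9}$$

For any point $\mathbf{x}$ on a general surface $S$, we may define an independent surface field quantity

$$\sigma(\mathbf{x}) = -\lim_{\mathbf{x}' \to \mathbf{x}} \hat{\mathbf{n}}(\mathbf{x}) \cdot \nabla' \psi(\mathbf{x}'); \quad \mathbf{x} \text{ on } S \tag{10}$$

where $\hat{\mathbf{n}}(\mathbf{x})$ is the outward unit normal to $S$ at $\mathbf{x}$. In the case of a waveguide aperture, $\sigma$ simplifies to

$$\sigma(\mathbf{x}_\perp, 0) = -\left.\frac{\partial \psi(\mathbf{x}_\perp, z)}{\partial z}\right|_{z=0}; \quad \mathbf{x}_\perp \text{ on } W. \tag{11}$$

Inserting this into (8) and dropping the spatial coordinate $z$, we obtain the following integral equation on the waveguide aperture that relates the field, its longitudinal derivative, and the specified waveguide excitation on $W$:

$$2\psi^{\mathrm{out}}(\mathbf{x}_\perp) = \psi(\mathbf{x}_\perp) - \int_W d\mathbf{x}'_\perp\, H(\mathbf{x}_\perp, \mathbf{x}'_\perp)\,\sigma(\mathbf{x}'_\perp). \tag{12}$$

$H(\mathbf{x}_\perp, \mathbf{x}'_\perp)$ is the kernel of the "square root" of the transverse wave operator in the sense that

$$\int_W d\mathbf{x}'_\perp\, H(\mathbf{x}_\perp, \mathbf{x}'_\perp)\,H(\mathbf{x}'_\perp, \mathbf{x}''_\perp) = G_\perp(\mathbf{x}_\perp, \mathbf{x}''_\perp) \tag{13}$$

where $G_\perp$ obeys

$$(\nabla_\perp^2 + k^2)\,G_\perp(\mathbf{x}_\perp, \mathbf{x}'_\perp) = -\delta(\mathbf{x}_\perp - \mathbf{x}'_\perp) \tag{14}$$

inside the waveguide and satisfies the boundary conditions on the waveguide walls.

A different relation between $\psi$, $\sigma$, and the outgoing wave is obtained if we specify $\partial\psi^{\mathrm{out}}(\mathbf{x}_\perp, z)/\partial z$ instead of $\psi^{\mathrm{out}}(\mathbf{x}_\perp, z)$ to write

$$\begin{aligned}
\frac{\partial \psi^{\mathrm{out}}(\mathbf{x}_\perp, 0)}{\partial z} &= \sum_n a_n \frac{ik}{Z_n}\,u_n(\mathbf{x}_\perp) \\
&= \frac{1}{2}\sum_n (a_n - b_n)\,\frac{ik}{Z_n}\,u_n(\mathbf{x}_\perp) + \frac{1}{2}\sum_n (a_n + b_n)\,\frac{ik}{Z_n}\,u_n(\mathbf{x}_\perp) \\
&= \frac{1}{2}\left.\frac{\partial \psi(\mathbf{x}_\perp, z)}{\partial z}\right|_{z=0} + \frac{1}{2}\int_W d\mathbf{x}'_\perp\, \tilde{H}(\mathbf{x}_\perp, \mathbf{x}'_\perp)\,\psi(\mathbf{x}'_\perp, 0)
\end{aligned} \tag{15}$$

where³

$$\tilde{H}(\mathbf{x}_\perp, \mathbf{x}'_\perp) = \sum_n \frac{ik}{Z_n}\,u_n(\mathbf{x}_\perp)\,u_n(\mathbf{x}'_\perp). \tag{16}$$

Dropping the spatial coordinate $z$ and defining $\sigma$ as before, we get an alternative form for the waveguide integral equation

$$\begin{aligned}
-2\,\frac{\partial \psi^{\mathrm{out}}(\mathbf{x}_\perp)}{\partial z} &= -\frac{\partial \psi(\mathbf{x}_\perp)}{\partial z} - \int_W d\mathbf{x}'_\perp\, \tilde{H}(\mathbf{x}_\perp, \mathbf{x}'_\perp)\,\psi(\mathbf{x}'_\perp) && (17) \\
&= \sigma(\mathbf{x}_\perp) - \int_W d\mathbf{x}'_\perp\, \tilde{H}(\mathbf{x}_\perp, \mathbf{x}'_\perp)\,\psi(\mathbf{x}'_\perp). && (18)
\end{aligned}$$

$H(\mathbf{x}_\perp, \mathbf{x}'_\perp)$ and $\tilde{H}(\mathbf{x}_\perp, \mathbf{x}'_\perp)$ are "inverse operators" in the sense that

$$\int_W d\mathbf{x}'_\perp\, H(\mathbf{x}_\perp, \mathbf{x}'_\perp)\,\tilde{H}(\mathbf{x}'_\perp, \mathbf{x}''_\perp) = \delta(\mathbf{x}_\perp - \mathbf{x}''_\perp). \tag{19}$$

3 Note that $\tilde{H}(\mathbf{x}_\perp, \mathbf{x}'_\perp)$ is not a function since the sum over all $n$ does not converge. Rather, like the Dirac delta "function" $\delta(\mathbf{x}_\perp - \mathbf{x}'_\perp)$, it is a distribution, which, when convolved with a suitably smooth function, produces a well-defined value.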
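Because $H$ and $\tilde{H}$ are diagonal in the modal basis, the waveguide integral equations (12) and (18) and the inverse relation (19) can be checked one mode at a time. The sketch below does this for arbitrary modal amplitudes $a_n$, $b_n$, reading (12) as $2\psi^{\mathrm{out}} = \psi - H\sigma$ and (18) as $-2\,\partial\psi^{\mathrm{out}}/\partial z = \sigma - \tilde{H}\psi$, with $\sigma = -\partial\psi/\partial z$ on $W$. The function name `check_waveguide_relations` and the sample amplitudes are illustrative assumptions, not part of the paper.

```python
import cmath
import math


def check_waveguide_relations(k, betas, a, b):
    """Return per-mode residual magnitudes of (12), (18), and (19) in the modal basis."""
    residuals = []
    for beta, a_n, b_n in zip(betas, a, b):
        z_n = k / beta                             # modal impedance Z_n = k / beta_n
        psi_n = a_n + b_n                          # coefficient of psi at z = 0
        sigma_n = -(1j * k / z_n) * (a_n - b_n)    # coefficient of sigma = -d(psi)/dz
        h_n = z_n / (1j * k)                       # H acting on mode n
        h_tilde_n = 1j * k / z_n                   # H-tilde acting on mode n
        r12 = (psi_n - h_n * sigma_n) - 2.0 * a_n                          # (12)
        r18 = (sigma_n - h_tilde_n * psi_n) + 2.0 * (1j * k / z_n) * a_n   # (18)
        r19 = h_n * h_tilde_n - 1.0                                        # (19)
        residuals.append((abs(r12), abs(r18), abs(r19)))
    return residuals


k = 10.0
# Propagation constants of a unit-width soft-walled parallel-plate guide (assumed geometry).
betas = [cmath.sqrt(k * k - (n * math.pi) ** 2) for n in range(1, 5)]
a = [1.0, 0.3 - 0.2j, 0.0, 0.5j]      # outgoing amplitudes (arbitrary)
b = [0.1j, -0.4, 0.25, 0.05 + 0.05j]  # incoming amplitudes (arbitrary)
residuals = check_waveguide_relations(k, betas, a, b)
```

All residuals vanish to round-off regardless of the incoming amplitudes $b_n$, which is the sense in which (12) and (18) fix the outgoing wave while leaving the incoming wave unconstrained.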
eliminate $\psi$, putting the known field $\psi^{\mathrm{out}}(\mathbf{x})$ on the left and the unknown quantity $\sigma(\mathbf{x})$ on the right


“2  <b  ds' in' 
Jw 


Fig. 1. Antenna system composed of waveguide aperture W and antenna surface S.


Using (12) and (18) on the waveguide aperture W, we can derive boundary integral equations that apply to more general scattering cases. For example, we can write coupled boundary integral equations for the case of a waveguide aperture connected to a general scatterer. This is demonstrated in the next subsection for the special cases in which the scattering surface obeys either Dirichlet or Neumann boundary conditions. In both cases, it is assumed that the union of the scatterer S and waveguide aperture W forms a closed surface, as indicated in Fig. 1.

C.  Coupled  Integral  Equations 

In this section, we derive integral equations relating the known field emanating from the waveguide aperture to an unknown surface field (either $\psi$ or $\sigma$) for the generic closed antenna system shown in Fig. 1. For Dirichlet (Neumann) boundary conditions on S, the unknown surface field on both S and W is chosen to be $\sigma$ ($\psi$).

1) Dirichlet Boundary Conditions on S: The integral equation for the field (in the absence of an explicit incident wave) is [4]


TfG{x.x))v°-'x’ 


=  f  ds'G\x.  x'  )<7t x'  —  i  dsfG{x.x'  \r\xf 


ds'  i  ri  ■  V'Gix.x  u  j  d>’  H  x.  x  .  x 


for  x  on  S  and 

vou\x)  -  2  i  ds' in'  ■  rG(x.x,llr“;x'i 


=  <f  ds'G(x.x)a(x ) 


=  <f  ds'{[ h(x')  •  V'G(x.x')lvi'x') 

JS=-W  • 

-  G(x.x')a(x')}  (20) 

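for $\mathbf{x}$ on $S \cup W$. The Helmholtz kernel $G(\mathbf{x}, \mathbf{x}')$ is given by

$$G(\mathbf{x}, \mathbf{x}') = \begin{cases} \dfrac{i}{4} H_0^{(1)}(k|\mathbf{x} - \mathbf{x}'|) & \text{in 2-D} \\[2ex] \dfrac{e^{ik|\mathbf{x} - \mathbf{x}'|}}{4\pi|\mathbf{x} - \mathbf{x}'|} & \text{in 3-D} \end{cases} \tag{21}$$

where $H_0^{(1)}$ is the zeroth-order Hankel function of the first kind. For Dirichlet boundary conditions on S (i.e., $\psi(\mathbf{x} \text{ on } S) = 0$) we have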

0=  ds'G(x.x')a(x) 

Js?\v 

-  J  *'[n(x')  •  V'G(x.  x')]r(x')  (22) 


for  x  on  5  and 


ds'G(x.x')o r(x') 


+  Jw  ^'[n(x')  •  V'G(x.  x')]ti.(x')  (23) 

for $\mathbf{x}$ on W. Equations (22) and (23) along with either (12) or (18) form a set of coupled integral equations to be solved for $\psi(\mathbf{x})$ on W and $\sigma(\mathbf{x})$ on $S \cup W$. Using (12), we can


-  f  dsf[G(x.xf)(r{xf)  -  l-Hix.x  ict{x'Y' 

Jw  2 

-  /  ds'in*  •  V'Gi x. x7))  f  ds"Hix\x")cr(x"' 

Jw  Jw 

(25) 

for  x  on  W. 

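2) Neumann Boundary Conditions on S: The integral equation for $\sigma$ (i.e., the normal derivative of the field) may be written as [4]

$$\frac{1}{2}\sigma(\mathbf{x}) = -(\hat{\mathbf{n}}(\mathbf{x}) \cdot \nabla) \oint_{S \cup W} ds' \left\{ [\hat{\mathbf{n}}(\mathbf{x}') \cdot \nabla' G(\mathbf{x}, \mathbf{x}')]\,\psi(\mathbf{x}') + G(\mathbf{x}, \mathbf{x}')\,\sigma(\mathbf{x}') \right\} \tag{26}$$

or

$$\frac{1}{2}\sigma(\mathbf{x}) = \oint_{S \cup W} ds' \left\{ [\hat{\mathbf{n}}(\mathbf{x}) \times \nabla G(\mathbf{x}, \mathbf{x}')] \cdot [\hat{\mathbf{n}}(\mathbf{x}') \times \nabla' \psi(\mathbf{x}')] - k^2 (\hat{\mathbf{n}}(\mathbf{x}) \cdot \hat{\mathbf{n}}(\mathbf{x}'))\,G(\mathbf{x}, \mathbf{x}')\,\psi(\mathbf{x}') - \hat{\mathbf{n}}(\mathbf{x}) \cdot \nabla G(\mathbf{x}, \mathbf{x}')\,\sigma(\mathbf{x}') \right\} \tag{27}$$

for $\mathbf{x}$ on $S \cup W$. The first form is more compact (and for that reason is employed below), the second more convenient for numerical computation. For Neumann boundary conditions on S (i.e., $\sigma(\mathbf{x} \text{ on } S) = 0$), we have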

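$$0 = -(\hat{\mathbf{n}}(\mathbf{x}) \cdot \nabla) \int_S ds'\, [\hat{\mathbf{n}}(\mathbf{x}') \cdot \nabla' G(\mathbf{x}, \mathbf{x}')]\,\psi(\mathbf{x}') - (\hat{\mathbf{n}}(\mathbf{x}) \cdot \nabla) \int_W ds' \left\{ [\hat{\mathbf{n}}(\mathbf{x}') \cdot \nabla' G(\mathbf{x}, \mathbf{x}')]\,\psi(\mathbf{x}') + G(\mathbf{x}, \mathbf{x}')\,\sigma(\mathbf{x}') \right\} \tag{28}$$

for $\mathbf{x}$ on S and

$$\frac{1}{2}\sigma(\mathbf{x}) = -(\hat{\mathbf{n}}(\mathbf{x}) \cdot \nabla) \int_S ds'\, [\hat{\mathbf{n}}(\mathbf{x}') \cdot \nabla' G(\mathbf{x}, \mathbf{x}')]\,\psi(\mathbf{x}') - (\hat{\mathbf{n}}(\mathbf{x}) \cdot \nabla) \int_W ds' \left\{ [\hat{\mathbf{n}}(\mathbf{x}') \cdot \nabla' G(\mathbf{x}, \mathbf{x}')]\,\psi(\mathbf{x}') + G(\mathbf{x}, \mathbf{x}')\,\sigma(\mathbf{x}') \right\} \tag{29}$$

for $\mathbf{x}$ on W. Combining (28) and (29) with (18), we can eliminate $\sigma$ and write the following integral equations for $\psi(\mathbf{x})$ in terms of the known quantity $\partial\psi^{\mathrm{out}}(\mathbf{x})/\partial z$: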

f  dvouz 

2  ds  n-VGix.x'^— — ix') 

Jw  0- 

=  in  •  V)  J  ds\n  ■  V'G(x.x'))v(x') 

-in-V)  j  ds'in  •  V'Gix.  x'lk-fx') 

-  [  ds'in  ■  VGfx.x'))  /  c?s"J^(x/. : 
Jw  Jir 

for $\mathbf{x}$ on S and

Si-ou: 


')vix"»  (30) 


XI  -  2  /  ds'in  •  VG(X.X 


/,No'r< 


(x  / 


=  in-Vi  J  ds'in  •  VfGix.x'))v(x' ) 

“i n*V)  f  ds'in  ‘VfG(x.x))v(x') 

J  w 

--  [  ds'ff[x.xr)v{x) 

2  Jw 

-f  ds'in  <  VG(x.x'))  [  ds"H(x'.x")v(x")  (31) 
Jw  Jw 


for $\mathbf{x}$ on W.


D.  Discretization 

While analytical solutions for waveguide modes are known for a few special cross sections, in general, modes must be computed numerically. Even when analytical solutions exist, it is more convenient (from a computational perspective) to use numerical solutions because then all interacting surfaces, whether physical or intangible (e.g., waveguide apertures), can be treated equivalently.

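Assume the waveguide aperture has been discretized into a set of patches that support $M$ basis functions $f_m(\mathbf{x})$. Following the procedure given in the Appendix, we can write approximate expressions for the $N$ lowest waveguide modes in terms of basis functions defined on the aperture

$$u_n(\mathbf{x}) = \sum_{m=1}^{M} A_{nm} f_m(\mathbf{x}). \tag{32}$$

In the usual method of moments fashion, we approximate the field $\psi$ and its normal derivative $\sigma$ on the aperture as linear combinations of the basis functions with unknown coefficients $S_m^W$ and $I_m^W$

$$\psi(\mathbf{x}) \approx \sum_{m=1}^{M} S_m^W f_m(\mathbf{x}) \tag{33}$$

$$\sigma(\mathbf{x}) \approx \sum_{m=1}^{M} I_m^W f_m(\mathbf{x}). \tag{34}$$

We also approximate $H(\mathbf{x}, \mathbf{x}')$ as a truncated sum over the $N$ computed modes

$$H(\mathbf{x}, \mathbf{x}') \approx \sum_{n=1}^{N} \frac{Z_n}{ik}\,u_n(\mathbf{x})\,u_n(\mathbf{x}'). \tag{35}$$

Then, by substituting (32)-(35) into (12) and applying the testing operator $\int_W ds\, f_i(\mathbf{x})\,\cdot$ to both sides of the resultant equation, we arrive at the discretized form of (12)

$$2V^W = N^W S^W - X^W I^W \tag{36}$$

where

$$V_i^W = \int_W ds\, \psi^{\mathrm{out}}(\mathbf{x})\,f_i(\mathbf{x}) \tag{37a}$$

$$N_{ij}^W = \int_W ds\, f_i(\mathbf{x})\,f_j(\mathbf{x}) \tag{37b}$$

$$X_{ij}^W = \int_W ds \int_W ds'\, f_i(\mathbf{x})\,H(\mathbf{x}, \mathbf{x}')\,f_j(\mathbf{x}') = \left[(A N^W)^T \Lambda\,(A N^W)\right]_{ij} \tag{37c}$$

and

$$\Lambda_{mn} = \frac{Z_n}{ik}\,\delta_{mn}. \tag{38}$$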

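A similar procedure produces the discretized form of (18), namely

$$2\tilde{V}^W = N^W I^W - \tilde{X}^W S^W \tag{39}$$

where

$$\tilde{V}_i^W = -\int_W ds\, \frac{\partial \psi^{\mathrm{out}}(\mathbf{x})}{\partial z}\,f_i(\mathbf{x}) \tag{40a}$$

$$\tilde{X}_{ij}^W = \int_W ds \int_W ds'\, f_i(\mathbf{x})\,\tilde{H}(\mathbf{x}, \mathbf{x}')\,f_j(\mathbf{x}') = \left[(A N^W)^T \tilde{\Lambda}\,(A N^W)\right]_{ij} \tag{40b}$$

and

$$\tilde{\Lambda}_{mn} = \frac{ik}{Z_n}\,\delta_{mn} = (\Lambda^{-1})_{mn}. \tag{41}$$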

Equations (12) and (18) and their discretized equivalents (36) and (39) may be viewed as nonlocal inhomogeneous boundary conditions that must be obeyed on the waveguide aperture. They are nonlocal because the "surface impedance" terms $X^W$ and $\tilde{X}^W$ relate the field at one point on the aperture to its derivative not just at the same point, but everywhere on the aperture, and vice versa. The equations are inhomogeneous if excitations $V^W$ and $\tilde{V}^W$ are nonzero.
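The assembly of the aperture matrices can be sketched in a few lines for a 1-D aperture. The example below is a minimal illustration under stated assumptions, not the procedure of the Appendix: it uses non-overlapping pulse basis functions (so that $N^W$ is diagonal), samples the analytically known modes of a soft-walled parallel-plate guide at the cell centers to obtain $A_{nm}$, takes $\Lambda_{nn} = Z_n/(ik)$, and forms $X^W = (A N^W)^T \Lambda (A N^W)$. It then checks the discretized relation $2V^W = N^W S^W - X^W I^W$ for a purely outgoing lowest mode. All function and variable names are hypothetical.

```python
import cmath
import math


def matmul(p, q):
    """Product of two dense matrices stored as lists of rows."""
    return [[sum(p[i][l] * q[l][j] for l in range(len(q))) for j in range(len(q[0]))]
            for i in range(len(p))]


def transpose(p):
    return [list(row) for row in zip(*p)]


def aperture_matrices(width, k, n_cells, n_modes):
    """Build N, A, Lambda, and X = (A N)^T Lambda (A N) for pulse basis functions."""
    h = width / n_cells
    centers = [(m + 0.5) * h for m in range(n_cells)]
    # N_ij = integral of f_i f_j; pulses do not overlap, so N = h * identity.
    gram = [[h if i == j else 0.0 for j in range(n_cells)] for i in range(n_cells)]
    a_mat, lam = [], []
    for n in range(1, n_modes + 1):
        kc = n * math.pi / width
        beta = cmath.sqrt(k * k - kc * kc)
        # A_nm: mode n sampled at cell m (stand-in for numerically computed modes).
        a_mat.append([math.sqrt(2.0 / width) * math.sin(kc * x) for x in centers])
        lam.append((k / beta) / (1j * k))          # Lambda_nn = Z_n / (ik)
    an = matmul(a_mat, gram)                       # (A N)
    lam_an = [[lam[n] * v for v in row] for n, row in enumerate(an)]
    x_mat = matmul(transpose(an), lam_an)          # X = (A N)^T Lambda (A N)
    return gram, a_mat, lam, x_mat


k, width, n_cells = 10.0, 1.0, 40
gram, a_mat, lam, x_mat = aperture_matrices(width, k, n_cells, n_modes=4)
h = width / n_cells
s_vec = a_mat[0]                                   # psi = u_1: purely outgoing mode 1
i_vec = [-(1.0 / lam[0]) * v for v in a_mat[0]]    # sigma = -(ik / Z_1) u_1
v_vec = [h * v for v in a_mat[0]]                  # V_i by the midpoint rule
lhs = [2.0 * v for v in v_vec]
rhs = [sum(gram[i][j] * s_vec[j] for j in range(n_cells))
       - sum(x_mat[i][j] * i_vec[j] for j in range(n_cells))
       for i in range(n_cells)]
residual = max(abs(l - r) for l, r in zip(lhs, rhs))
```

Because $X^W$ is built from only $N$ modes, it is a dense but low-rank matrix; this is the discrete counterpart of the nonlocality noted above.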

The discretized forms of the coupled integral equations for Dirichlet boundary conditions on S are obtained by first approximating the source on S in terms of basis functions as

$$\sigma(\mathbf{x}) \approx \sum_{m=1}^{M} I_m^S f_m(\mathbf{x}) \tag{42}$$

then substituting this approximation and the approximate expressions for $\psi(\mathbf{x})$, $\sigma(\mathbf{x})$, and $H(\mathbf{x}_\perp, \mathbf{x}'_\perp)$ on W into (22) and (23) and finally applying the testing function operator $\int_{S \cup W} ds\, f_i(\mathbf{x})\,\cdot$ to both sides. The result in block matrix form is

— 2}''5U  (ArH  " 

V”’ 


'Zss 

Zsw  +  YSW(NW)- 

1  Ar"  " 

'Is  ' 

ZWS 

zww _ ixw 

Iw 

(35) 


(43) 
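where

$$Y_{ij}^{\alpha\beta} = \int_\alpha ds \int_\beta ds'\, f_i(\mathbf{x})\,(\hat{\mathbf{n}}' \cdot \nabla' G(\mathbf{x}, \mathbf{x}'))\,f_j(\mathbf{x}') \tag{44}$$

$$Z_{ij}^{\alpha\beta} = \int_\alpha ds \int_\beta ds'\, f_i(\mathbf{x})\,G(\mathbf{x}, \mathbf{x}')\,f_j(\mathbf{x}') \tag{45}$$

with S or W replacing $\alpha$ and $\beta$.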

An analogous result is obtained for the case of Neumann boundary conditions on S. We approximate the source on S as

$$\psi(\mathbf{x}) \approx \sum_{m=1}^{M} S_m^S f_m(\mathbf{x}) \tag{46}$$

substitute this expression and the approximate expressions for $\psi(\mathbf{x})$, $\sigma(\mathbf{x})$, and $\tilde{H}(\mathbf{x}_\perp, \mathbf{x}'_\perp)$ on W into (28) and (29) and then apply the testing operator. The result is

’2ys"’(A:U)_1ru" 

-f"' 


r  zss 

zsw  -  ysu'(Ar"')-1y"'1 

'5s  ‘ 

[z"’s 

N» 

T  , 

tol^ 

Ixj. 

1 

Sir 

where 

=  /  ds  fds'fi(x)(n-‘VG(x.x'))fj(x')  (48) 

J  S  J  w 

K3  =  J  &  j  ds'[fi(x)[ n(x)  x  VG(x.x')]  ■  (n(x') 

x  V'/;(x')--/r2(n(x)  •  n(x'))/;(x)G(x.  x')/,(x')] 


with S or W replacing $\alpha$ and $\beta$.

E.  Modal  Decomposition 

In preparation for computing the power flowing across the waveguide aperture in either direction, it is useful to write $\psi$ and $\partial\psi/\partial z$ in terms of modes propagating in either direction.

By employing the completeness relation for the modes we can decompose the field on W into a sum over modes as

$$\psi(\mathbf{x}) = \sum_n \eta_n u_n(\mathbf{x}) \tag{50}$$

where

$$\eta_n = \int_W ds\, u_n(\mathbf{x})\,\psi(\mathbf{x}) \tag{51}$$

is the amplitude of the $n$th mode contained in $\psi(\mathbf{x})$. It is useful to further decompose $\psi(\mathbf{x})$ into its incoming and outgoing components

$$\psi(\mathbf{x}) = \psi^{\mathrm{in}}(\mathbf{x}) + \psi^{\mathrm{out}}(\mathbf{x}). \tag{52}$$

Since the discretized representation of $\psi^{\mathrm{out}}(\mathbf{x})$ is given by $V^W$, we may write the discretized form of $\eta_n^{\mathrm{out}}$ as

$$\eta_n^{\mathrm{out}} = \sum_m A_{nm} V_m^W. \tag{53}$$


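Using (12) to eliminate $\psi(\mathbf{x})$, we arrive at the discretized form of $\eta_n^{\mathrm{in}}$

$$\eta_n^{\mathrm{in}} = \sum_m A_{nm} \left(V^W + X^W I^W\right)_m. \tag{54}$$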


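Similarly, we may decompose the longitudinal derivative of the field as

$$\frac{\partial \psi(\mathbf{x})}{\partial z} = \sum_n \tilde{\eta}_n u_n(\mathbf{x}) \tag{55}$$

where

$$\tilde{\eta}_n = \int_W ds\, u_n(\mathbf{x})\,\frac{\partial \psi(\mathbf{x})}{\partial z}. \tag{56}$$

Then, using

$$\frac{\partial \psi(\mathbf{x})}{\partial z} = \frac{\partial \psi^{\mathrm{in}}(\mathbf{x})}{\partial z} + \frac{\partial \psi^{\mathrm{out}}(\mathbf{x})}{\partial z} \tag{57}$$

and (18), we can write $\tilde{\eta}_n^{\mathrm{out}}$ and $\tilde{\eta}_n^{\mathrm{in}}$ in discretized form as

$$\tilde{\eta}_n^{\mathrm{out}} = -\sum_m A_{nm} \tilde{V}_m^W \tag{58}$$

and

$$\tilde{\eta}_n^{\mathrm{in}} = -\sum_m A_{nm} \left(\tilde{V}^W + \tilde{X}^W S^W\right)_m. \tag{59}$$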


F. Power

The time-averaged power-flow density vector (the scalar equivalent to the Poynting vector) is [5]

$$\langle \mathbf{S}(\mathbf{x}) \rangle = \frac{1}{2}\,\mathrm{Re}\left[\,i c \omega\, \psi(\mathbf{x})\,\nabla \psi(\mathbf{x})^*\right] \tag{60}$$

where $c$ is a constant.

The total power flowing across the waveguide aperture in the $z$ direction is made up of an incoming part associated with the incoming parts of $\psi$ and $\partial\psi/\partial z$ and an outgoing part associated with the outgoing parts of $\psi$ and $\partial\psi/\partial z$. The total power exiting (entering) the waveguide aperture is given by

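$$P^q = \int_W ds\, \langle \mathbf{S}^q(\mathbf{x}) \cdot \hat{\mathbf{z}} \rangle = \frac{1}{2}\,\mathrm{Re} \int_W ds \left[\,i c \omega\, \psi^q(\mathbf{x})\,\frac{\partial \psi^q(\mathbf{x})^*}{\partial z}\right] \tag{61}$$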


for $q$ = out (in). This integral is most conveniently evaluated by decomposing $\psi^q$ and $\partial\psi^q/\partial z$ into their modal components. The reason is that since the modes are orthogonal, the power in the sum over modes is equal to the sum of the powers in each mode.

The amplitude of the $n$th outgoing (incoming) mode contained in $\psi(\mathbf{x})$ is $\eta_n^{\mathrm{out}}$ ($\eta_n^{\mathrm{in}}$). Therefore, the time-averaged power exiting (entering) the waveguide aperture is

$$P^q = c \omega k \sum_{n=1}^{n_{\max}} \frac{|\eta_n^q|^2}{2 Z_n} \tag{62}$$

for $q$ = out (in), where $n_{\max}$ is the largest value of $n$ for which $\beta_n$ is real. We exclude modes with imaginary propagation constants since such modes do not transport any power into or out of the guide on average.
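The modal power sum is simple to evaluate once the amplitudes are known. The sketch below reads (62) as $P^q = c\,\omega k \sum_n |\eta_n^q|^2 / (2 Z_n)$ with $Z_n = k/\beta_n$ and skips every mode whose propagation constant is not real and positive. The function name `modal_power`, the unit-width parallel-plate propagation constants, and the sample amplitudes are illustrative assumptions.

```python
import cmath
import math


def modal_power(c, omega, k, betas, eta):
    """Time-averaged power carried by the propagating modes, following (62)."""
    total = 0.0
    for beta, eta_n in zip(betas, eta):
        # Evanescent modes (imaginary beta) carry no average power and are skipped.
        if abs(beta.imag) > 0.0 or beta.real <= 0.0:
            continue
        z_n = k / beta.real                       # modal impedance of a propagating mode
        total += c * omega * k * abs(eta_n) ** 2 / (2.0 * z_n)
    return total


k = 10.0
betas = [cmath.sqrt(k * k - (n * math.pi) ** 2) for n in range(1, 5)]
eta_out = [1.0, 0.5j, 0.2, 3.0]                   # the fourth mode is evanescent here
p_out = modal_power(c=1.0, omega=k, k=k, betas=betas, eta=eta_out)
```

Since $k/Z_n = \beta_n$, each propagating mode contributes $\tfrac{1}{2} c\,\omega \beta_n |\eta_n|^2$; the large amplitude assigned to the evanescent fourth mode does not change the result.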

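The amplitude of the $n$th outgoing (incoming) mode contained in $\partial\psi/\partial z$ is $\tilde{\eta}_n^{\mathrm{out}}$ ($\tilde{\eta}_n^{\mathrm{in}}$). Therefore, the time-averaged power exiting (entering) the waveguide aperture is

$$P^q = \frac{c \omega}{k} \sum_{n=1}^{n_{\max}} \frac{Z_n\,|\tilde{\eta}_n^q|^2}{2} \tag{63}$$

for $q$ = out (in).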

1) Acoustic Waves: If $\psi$ is the velocity potential, i.e., $\mathbf{v} = \nabla\psi$, and $\rho$ is the mass density, then the constant $c$ in (60) is given by


$$c = \rho. \tag{64}$$

Furthermore,  the  acoustic  impedance  [5]  is  related  to  our 
modal  impedance  by 


Z™ic  =  fpZn.  (65) 

2) Electromagnetic Waves in Two Dimensions: Suppose a waveguide whose axis is parallel to $z$ is also translationally invariant in the $y$ direction, i.e., the waveguide consists of a pair of half-infinite plates parallel to the $yz$ plane. When a geometry is translationally invariant in one direction, the electromagnetic scattering problem can be decoupled into two independent problems, each of which is isomorphic to a 2-D scalar scattering problem with a different boundary condition. If the 3-D surfaces are perfectly conducting, the boundary conditions for the corresponding scalar fields on the corresponding 2-D surfaces become either Dirichlet or Neumann.

Solutions to the scalar waveguide problem with Dirichlet boundary conditions inside the waveguide correspond to solutions to the electromagnetic waveguide problem with exclusively TE modes inside the waveguide according to

E(xj  =  v(x)x.  Hi  x)  =  ^~x  x  z  Dirichlet/TE  (66) 

IsjC  fj 

and solutions to the scalar waveguide problem with Neumann boundary conditions inside the waveguide correspond to solutions to the electromagnetic waveguide problem with exclusively TM modes inside the waveguide according to

Hix)  =  v(x)x.  Eu)  =  — U  x  x  Neumann/TM.  (67) 

luJC 

Note how the correspondence between TM or TE polarization and Dirichlet or Neumann boundary conditions in the waveguide mode case differs from the correspondence between TM or TE polarization and Dirichlet or Neumann boundary conditions in the case of scattering from perfect conductors. On a perfect conductor we associate TM-polarized electromagnetic scattering with solutions to the scalar scattering problem with Dirichlet boundary conditions according to

E(x)  =  t!>(x)y.  H(x)  =  ^Eiy  x  n  Dirichlet/TM  (68) 


and we associate TE-polarized electromagnetic scattering with solutions to the scalar scattering problem with Neumann boundary conditions according to

Hix  =  ririy.  Eixi  = - ri  x  v  NeumannTE  (6Qi 

where $\hat{\mathbf{y}}$ is the direction of translational invariance and $\hat{\mathbf{n}}$ is the outward surface normal. Therefore, the waveguide-excited electromagnetic scattering problem with TM (TE) polarization, in which all the scattering surfaces are perfect conductors, is equivalent to the waveguide-excited scalar problem in which Neumann (Dirichlet) boundary conditions hold on the inner walls of the waveguide and Dirichlet (Neumann) boundary conditions hold on all the surfaces of all the scatterers.

For  electromagnetic  waves  in  two  dimensions,  the  constant 
c  in  (60)  is  given  by 

f  Dirichlet/TE 

(70) 

{ rfrf  Neumann/TM 

where $\mu$ and $\epsilon$ are appropriate to the material inside the guide.


III. Electromagnetic Waveguide Equations


A.  Modes 


The electric and magnetic fields inside a waveguide with perfectly conducting walls can be decomposed into modal components just as the field and its normal derivative were in the scalar case. The essential difference is that now there are three distinct categories of modal fields, namely TM, TE, and transverse electromagnetic (TEM); each is a vector function rather than a scalar function. For our purposes, it is sufficient to consider only the transverse components of the electric and magnetic fields. Assuming the guide is uniformly filled with a nondissipative medium having dielectric constant $\epsilon$ and magnetic permeability $\mu$, we may write⁴ [6]

$$\mathbf{E}_\perp(\mathbf{x}_\perp, z) = \sum_n \left(a_n e^{i\beta_n z} + b_n e^{-i\beta_n z}\right) \mathbf{u}_n(\mathbf{x}_\perp) \tag{71}$$

$$\mathbf{H}_\perp(\mathbf{x}_\perp, z) = \sum_n \left(a_n e^{i\beta_n z} - b_n e^{-i\beta_n z}\right) \frac{1}{Z_n}\,\hat{\mathbf{z}} \times \mathbf{u}_n(\mathbf{x}_\perp) \tag{72}$$

where the modal impedance $Z_n$ is given by

$$Z_n = \sqrt{\frac{\mu}{\epsilon}} \times \begin{cases} \beta_n/k, & \text{for } n \in \text{TM modes} \\ 1, & \text{for } n \in \text{TEM modes} \\ k/\beta_n, & \text{for } n \in \text{TE modes.} \end{cases} \tag{73}$$

The modes are the eigensolutions to the transverse Helmholtz equation

$$(\nabla_\perp^2 + k^2 - \beta_n^2)\,\mathbf{u}_n(\mathbf{x}_\perp) = 0 \tag{74}$$

for $\mathbf{x}_\perp$ inside the waveguide aperture W and $\mathbf{u}_n(\mathbf{x}_\perp)$ constrained by the perfect electrical conductor boundary condition on $\partial W$. With proper normalization, the modes form a complete

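4 As in the scalar case, cutoff modes are neglected.

5 $\overleftrightarrow{\delta}(\mathbf{x} - \mathbf{x}')$ is a tensor distribution, which, for any vector-valued surface functions $\mathbf{f}(\mathbf{x})$ and $\mathbf{g}(\mathbf{x})$ on W obeys

$$\int_W ds'\, \mathbf{f}(\mathbf{x}) \cdot \overleftrightarrow{\delta}(\mathbf{x} - \mathbf{x}') \cdot \mathbf{g}(\mathbf{x}') = \mathbf{f}(\mathbf{x}) \cdot \mathbf{g}(\mathbf{x}).$$


OTTUSCH et al.: INTEGRAL EQUATIONS AND DISCRETIZATIONS FOR WAVEGUIDE APERTURES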


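and orthonormal set of functions over W, i.e.,

$$\left.\begin{aligned} \sum_n \mathbf{u}_n(\mathbf{x}_\perp)\,\mathbf{u}_n(\mathbf{x}'_\perp) &= \overleftrightarrow{\delta}(\mathbf{x}_\perp - \mathbf{x}'_\perp) \\ \sum_n [\hat{\mathbf{z}} \times \mathbf{u}_n(\mathbf{x}_\perp)]\,[\hat{\mathbf{z}} \times \mathbf{u}_n(\mathbf{x}'_\perp)] &= \overleftrightarrow{\delta}(\mathbf{x}_\perp - \mathbf{x}'_\perp) \end{aligned}\right\} \quad \text{Completeness⁵} \tag{75}$$

and

$$\int_W d\mathbf{x}_\perp\, \mathbf{u}_m(\mathbf{x}_\perp) \cdot \mathbf{u}_n(\mathbf{x}_\perp) = \delta_{mn} \qquad \text{Orthonormality.} \tag{76}$$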

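B. Computation of Vector Modes from Scalar Functions

The TM and TE modes can be deduced from the solutions to the scalar Helmholtz equation on W with Dirichlet and Neumann boundary conditions, respectively, on $\partial W$ [6]. The TM mode corresponding to the $n$th scalar waveguide mode $\psi_n(\mathbf{x}_\perp)$ obeying Dirichlet boundary conditions on $\partial W$ is

$$\mathbf{u}_n(\mathbf{x}_\perp) = \frac{\nabla_\perp \psi_n(\mathbf{x}_\perp)}{\sqrt{k^2 - \beta_n^2}} \tag{77}$$

and the TE mode corresponding to the $n$th scalar waveguide mode $\psi_n(\mathbf{x}_\perp)$ obeying Neumann boundary conditions on $\partial W$ is

$$\mathbf{u}_n(\mathbf{x}_\perp) = \frac{\hat{\mathbf{z}} \times \nabla_\perp \psi_n(\mathbf{x}_\perp)}{\sqrt{k^2 - \beta_n^2}}. \tag{78}$$

TEM modes are possible if and only if W is multiply connected, in which case they are related to solutions to the electrostatic potential problem on W. The TEM mode corresponding to the solution $\phi_n(\mathbf{x})$ to the electrostatic potential problem on W with all except the $n$th boundary at zero potential is given by

$$\mathbf{u}_n(\mathbf{x}_\perp) \propto \nabla_\perp \phi_n(\mathbf{x}_\perp). \tag{79}$$

The scale factor should be chosen to enforce orthonormality for the TEM modes. This amounts to assigning a particular value to the otherwise arbitrary potential on the $n$th boundary. For all TEM modes, $\beta_n = k$.

C. Waveguide Integral Equation

Let $\mathbf{E}_\perp^{\mathrm{out}}(\mathbf{x}_\perp, z)$ be the transverse component of electric field for a specified outgoing wave. Using the modal expansions and the first completeness relation for the modes, we can write the following expression for $\mathbf{E}_\perp^{\mathrm{out}}(\mathbf{x}_\perp, 0)$ in terms of the transverse components of the electric and magnetic fields on W:

$$\begin{aligned}
\mathbf{E}_\perp^{\mathrm{out}}(\mathbf{x}_\perp, 0) &= \sum_n a_n \mathbf{u}_n(\mathbf{x}_\perp) \\
&= \frac{1}{2}\sum_n (a_n + b_n)\,\mathbf{u}_n(\mathbf{x}_\perp) + \frac{1}{2}\sum_n Z_n (a_n - b_n)\,\frac{1}{Z_n}\,\mathbf{u}_n(\mathbf{x}_\perp) \\
&= \frac{1}{2}\mathbf{E}_\perp(\mathbf{x}_\perp, 0) - \frac{1}{2}\int_W d\mathbf{x}'_\perp\, \overleftrightarrow{H}(\mathbf{x}_\perp, \mathbf{x}'_\perp) \cdot (\hat{\mathbf{z}} \times \mathbf{H}_\perp(\mathbf{x}'_\perp, 0))
\end{aligned} \tag{80}$$

where the dyad

$$\overleftrightarrow{H}(\mathbf{x}_\perp, \mathbf{x}'_\perp) = \sum_n Z_n\,\mathbf{u}_n(\mathbf{x}_\perp)\,\mathbf{u}_n(\mathbf{x}'_\perp) \tag{81}$$

is the analogue of the scalar function $H(\mathbf{x}_\perp, \mathbf{x}'_\perp)$. Dropping the spatial coordinate $z$, we get the following expression for the waveguide integral equation on W, which relates the transverse components of the electric field, the magnetic field, and the specified electric field waveguide excitation on W:

$$2\mathbf{E}_\perp^{\mathrm{out}}(\mathbf{x}_\perp) = \mathbf{E}_\perp(\mathbf{x}_\perp) - \int_W d\mathbf{x}'_\perp\, \overleftrightarrow{H}(\mathbf{x}_\perp, \mathbf{x}'_\perp) \cdot (\hat{\mathbf{z}} \times \mathbf{H}_\perp(\mathbf{x}'_\perp)). \tag{82}$$

Defining equivalent electric and magnetic currents on W by

$$\mathbf{J}(\mathbf{x}_\perp) = \hat{\mathbf{z}} \times \mathbf{H}_\perp(\mathbf{x}_\perp) \tag{83}$$

$$\mathbf{M}(\mathbf{x}_\perp) = -\hat{\mathbf{z}} \times \mathbf{E}_\perp(\mathbf{x}_\perp) \tag{84}$$

allows us to write the waveguide integral equation in terms of equivalent currents as

$$2\mathbf{E}_\perp^{\mathrm{out}}(\mathbf{x}_\perp) = \hat{\mathbf{z}} \times \mathbf{M}(\mathbf{x}_\perp) - \int_W d\mathbf{x}'_\perp\, \overleftrightarrow{H}(\mathbf{x}_\perp, \mathbf{x}'_\perp) \cdot \mathbf{J}(\mathbf{x}'_\perp). \tag{85}$$

If $\mathbf{H}_\perp^{\mathrm{out}}(\mathbf{x}_\perp, z)$ is specified instead of $\mathbf{E}_\perp^{\mathrm{out}}(\mathbf{x}_\perp, z)$, we may write

$$\begin{aligned}
\mathbf{H}_\perp^{\mathrm{out}}(\mathbf{x}_\perp, 0) &= \sum_n a_n \frac{1}{Z_n}\,\hat{\mathbf{z}} \times \mathbf{u}_n(\mathbf{x}_\perp) \\
&= \frac{1}{2}\sum_n (a_n - b_n)\,\frac{1}{Z_n}\,\hat{\mathbf{z}} \times \mathbf{u}_n(\mathbf{x}_\perp) + \frac{1}{2}\sum_n (a_n + b_n)\,\frac{1}{Z_n}\,\hat{\mathbf{z}} \times \mathbf{u}_n(\mathbf{x}_\perp) \\
&= \frac{1}{2}\mathbf{H}_\perp(\mathbf{x}_\perp, 0) + \frac{1}{2}\int_W d\mathbf{x}'_\perp\, \overleftrightarrow{\tilde{H}}(\mathbf{x}_\perp, \mathbf{x}'_\perp) \cdot (\hat{\mathbf{z}} \times \mathbf{E}_\perp(\mathbf{x}'_\perp, 0))
\end{aligned} \tag{86}$$

where the dyad

$$\overleftrightarrow{\tilde{H}}(\mathbf{x}_\perp, \mathbf{x}'_\perp) = \sum_n \frac{1}{Z_n}\,(\hat{\mathbf{z}} \times \mathbf{u}_n(\mathbf{x}_\perp))\,(\hat{\mathbf{z}} \times \mathbf{u}_n(\mathbf{x}'_\perp)) \tag{87}$$

is the analogue of the scalar distribution $\tilde{H}(\mathbf{x}_\perp, \mathbf{x}'_\perp)$. Dropping the spatial coordinate $z$, we get an alternative form of the waveguide integral equation

$$2\mathbf{H}_\perp^{\mathrm{out}}(\mathbf{x}_\perp) = \mathbf{H}_\perp(\mathbf{x}_\perp) + \int_W d\mathbf{x}'_\perp\, \overleftrightarrow{\tilde{H}}(\mathbf{x}_\perp, \mathbf{x}'_\perp) \cdot (\hat{\mathbf{z}} \times \mathbf{E}_\perp(\mathbf{x}'_\perp)) \tag{88}$$

or in terms of equivalent currents

$$2\mathbf{H}_\perp^{\mathrm{out}}(\mathbf{x}_\perp) = -\hat{\mathbf{z}} \times \mathbf{J}(\mathbf{x}_\perp) - \int_W d\mathbf{x}'_\perp\, \overleftrightarrow{\tilde{H}}(\mathbf{x}_\perp, \mathbf{x}'_\perp) \cdot \mathbf{M}(\mathbf{x}'_\perp). \tag{89}$$

Equations (85) and (89) are the electromagnetic counterparts of the scalar waveguide integral equations given in (12) and (18).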
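The construction of vector modes from scalar modes can be illustrated with a rectangular aperture, where the scalar Dirichlet modes are known in closed form. The sketch below builds TM vector modes as the transverse gradient of a normalized scalar mode divided by the cutoff wavenumber $\sqrt{k^2 - \beta_n^2}$, a scale factor assumed here because it makes the vector modes orthonormal, and checks that property by quadrature. It also evaluates the three branches of the modal impedance (73). The rectangular geometry and the names `tm_mode`, `inner`, and `modal_impedance` are illustrative assumptions.

```python
import math


def tm_mode(p, q, a, b):
    """Vector TM mode u = grad(psi) / kc from the scalar Dirichlet mode psi_pq."""
    kx, ky = p * math.pi / a, q * math.pi / b
    kc = math.hypot(kx, ky)                 # kc^2 = k^2 - beta^2
    amp = 2.0 / math.sqrt(a * b)            # normalizes the scalar mode psi_pq

    def u(x, y):
        ux = amp * kx * math.cos(kx * x) * math.sin(ky * y) / kc
        uy = amp * ky * math.sin(kx * x) * math.cos(ky * y) / kc
        return ux, uy

    return u, kc


def inner(u, v, a, b, points=60):
    """Midpoint-rule approximation of the integral of u . v over the a-by-b aperture."""
    hx, hy = a / points, b / points
    total = 0.0
    for i in range(points):
        for j in range(points):
            ux, uy = u((i + 0.5) * hx, (j + 0.5) * hy)
            vx, vy = v((i + 0.5) * hx, (j + 0.5) * hy)
            total += (ux * vx + uy * vy) * hx * hy
    return total


def modal_impedance(kind, k, beta, mu=1.0, eps=1.0):
    """Modal impedance of (73) for kind in {'TM', 'TEM', 'TE'}."""
    eta = math.sqrt(mu / eps)
    return eta * {"TM": beta / k, "TEM": 1.0, "TE": k / beta}[kind]


u11, kc11 = tm_mode(1, 1, 2.0, 1.0)
u21, kc21 = tm_mode(2, 1, 2.0, 1.0)
norm11 = inner(u11, u11, 2.0, 1.0)          # close to 1
cross = inner(u11, u21, 2.0, 1.0)           # close to 0
```

A TE mode would be obtained in the same way from a Neumann scalar mode by rotating the gradient through $\hat{\mathbf{z}} \times$, which leaves the norm unchanged.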


D. Discretization

As stated above, the TM and TE vector modes on W are derivable from the scalar modes on W with Dirichlet and Neumann boundary conditions on $\partial W$, respectively, and the TEM vector modes (if any) are derivable from the solutions to the electrostatic potential problem on W. One can compute approximate solutions for the scalar modes and the electrostatic potential by putting scalar basis functions on W and following the procedure given in the Appendix. Once this has been accomplished, one has to choose between keeping the representation of the modes in terms of the scalar discretization or converting it to an equivalent vector discretization. If the scalar discretization is kept on the aperture, specialized code must be written to handle interactions with the waveguide aperture. On the other hand, if the waveguide modes are converted to a vector discretization early on, then the interactions between the various scattering surfaces, whether physical or waveguide aperture, can be handled in a consistent fashion, i.e., entirely in terms of vector basis functions. For computations involving more than just the waveguide alone, we find the latter choice to be the simplest and cleanest to implement.

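If we discretize the electric current $\mathbf{J}(\mathbf{x}_\perp)$ and magnetic current $\mathbf{M}(\mathbf{x}_\perp)$ on W in terms of $M$ vector basis functions $\mathbf{f}_m^W(\mathbf{x}_\perp)$ using

$$\mathbf{J}(\mathbf{x}_\perp) \approx \sum_{m=1}^{M} I_m^W\,\mathbf{f}_m^W(\mathbf{x}_\perp) \tag{90}$$

$$\mathbf{M}(\mathbf{x}_\perp) \approx \sum_{m=1}^{M} S_m^W\,(\mathbf{f}_m^W(\mathbf{x}_\perp) \times \hat{\mathbf{z}}) \tag{91}$$

we may write the first waveguide integral equation (85) in its discretized form as

$$2V^W = N^W S^W - X^W I^W \tag{92}$$

where

$$V_i^W = \int_W d\mathbf{x}_\perp\, \mathbf{E}_\perp^{\mathrm{out}}(\mathbf{x}_\perp) \cdot \mathbf{f}_i(\mathbf{x}_\perp) \tag{93a}$$

$$N_{ij}^W = \int_W d\mathbf{x}_\perp\, \mathbf{f}_i(\mathbf{x}_\perp) \cdot \mathbf{f}_j(\mathbf{x}_\perp) \tag{93b}$$

$$X_{ij}^W = \int_W d\mathbf{x}_\perp \int_W d\mathbf{x}'_\perp\, \mathbf{f}_i(\mathbf{x}_\perp) \cdot \overleftrightarrow{H}(\mathbf{x}_\perp, \mathbf{x}'_\perp) \cdot \mathbf{f}_j(\mathbf{x}'_\perp).$$

Similarly, the discretized form of the second waveguide integral equation (89) becomes

$$2\tilde{V}^W = N^W I^W - \tilde{X}^W S^W \tag{97}$$

where

$$\tilde{V}_i^W = \int_W d\mathbf{x}_\perp\, \mathbf{H}_\perp^{\mathrm{out}}(\mathbf{x}_\perp) \cdot (\mathbf{f}_i(\mathbf{x}_\perp) \times \hat{\mathbf{z}}) \tag{98a}$$

$$\tilde{X}_{ij}^W = \int_W d\mathbf{x}_\perp \int_W d\mathbf{x}'_\perp\, (\mathbf{f}_i(\mathbf{x}_\perp) \times \hat{\mathbf{z}}) \cdot \overleftrightarrow{\tilde{H}}(\mathbf{x}_\perp, \mathbf{x}'_\perp) \cdot (\mathbf{f}_j(\mathbf{x}'_\perp) \times \hat{\mathbf{z}}) = \left[(B N^W)^T \tilde{\Lambda}\,(B N^W)\right]_{ij} \tag{98b}$$

and

$$\tilde{\Lambda}_{mn} = \frac{1}{Z_n}\,\delta_{mn} = (\Lambda^{-1})_{mn}. \tag{99}$$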


E.  Coupled  Integral  Equations  in  the  Perfect  Conductor  Case 

Suppose the waveguide W is the primary source of radiation
for a general antenna problem in which all other scattering
surfaces S may be treated as perfect conductors. If there are
no other sources, the electric field integral equation (EFIE) for
$\mathbf{x}$ on $S \cup W$ is [7]


0  = 


)  x 


Mix) 


x  G(x.x')  •  J(x')  j-  V'Gix.x')  x  M(x') 


(100) 


The tangential component of the electric field vanishes on a
perfect conductor; hence, $\mathbf{M} = 0$ on S. At this point, we could
rewrite the above equation in the separate forms appropriate
to $\mathbf{x}$ on S and $\mathbf{x}$ on W and eliminate $\mathbf{M}$ on W by means
of (85), thereby obtaining a set of coupled integral equations
for the fields on S and W, just as we did in the scalar case.
Then we could convert them to discretized form. Alternatively,
we could discretize (100) as it stands, eliminate the unknown
equivalent magnetic current amplitudes on W using (92), and
achieve the discretized form directly. For brevity, we follow
the latter approach.


A  discretized  version  of  (100)  in  block  matrix  form  is 


0 

0 


-zss  zsir 

ztrs  zirir 


yS  IV 

V"' 

2J> 


'7s  1 
I'y 

sw 


(101) 


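$$X_{mn}^W = [(B N^W)^T \Lambda\, (B N^W)]_{mn} \tag{93c}$$

and

$$\mathbf{u}_m(\mathbf{x}) = \sum_n B_{mn}\, \mathbf{f}_n(\mathbf{x}) \tag{94}$$

$$\Lambda_{mn} = Z_n \delta_{mn}. \tag{95}$$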

We get the elements of $B_{mn}$ by computing inner products of
the vector basis functions with gradients of the scalar basis
functions. For example, if $\mathbf{u}_m$ corresponds to a TM mode, it
is clear from (32), (77), and (94) and the definition of $N^W$
that the entries in the mth row of $B_{mn}$ are given by

B-~ 


where 

Z^3  =  J^ds  j^ds'{f(x)  ■  +irv'jc(x.x') 

'  (102) 
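$$Y_{mn}^{\alpha\beta} = \int_\alpha ds \int_\beta ds'\, \mathbf{f}_m^\alpha(\mathbf{x}) \cdot \left(\nabla' G(\mathbf{x},\mathbf{x}') \times (\mathbf{f}_n^\beta(\mathbf{x}') \times \hat{\mathbf{n}}')\right) \tag{103}$$

with S or W replacing $\alpha$ and $\beta$ and $I^S$ representing the block
of unknown current amplitudes on S, which is related to the
electric current $\mathbf{J}$ on S by

$$\mathbf{J}(\mathbf{x}) \approx \sum_m I_m^S\, \mathbf{f}_m^S(\mathbf{x}). \tag{104}$$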

Rewriting (92) as

$$S^W = 2(N^W)^{-1} V^W + (N^W)^{-1} X^W I^W$$


(96) 


(105)

we can eliminate the block of unknowns $S^W$ in favor of $I^W$
to obtain the discretized version of (100) in its simplest block
form


r"' 


■  zss 

zsiv  -Ys'v(x"  r 

I - 

z’rs 

7u  tr  _  |A-.r 

(106) 
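The elimination step in (105) can be illustrated numerically. The following is a minimal sketch, not the authors' code: it solves (92) for $S^W$ given small made-up matrices and checks that the result satisfies (92). The 2x2 matrix values and the helper names `solve`, `matvec`, and `eliminate_s` are illustrative assumptions.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def matvec(a, v):
    """Matrix-vector product for nested lists."""
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]


def eliminate_s(n_w, x_w, v_w, i_w):
    """Return S^W = (N^W)^{-1} (2 V^W + X^W I^W), which is (105)."""
    rhs = [2 * v + xi for v, xi in zip(v_w, matvec(x_w, i_w))]
    return solve(n_w, rhs)


# Hypothetical data for two basis functions on the aperture.
N_W = [[2.0, 0.5], [0.5, 1.0]]
X_W = [[1.0 + 0.5j, 0.2], [0.2, 0.8 - 0.1j]]
V_W = [1.0, 0.0]
I_W = [0.3 - 0.2j, 0.1j]

S_W = eliminate_s(N_W, X_W, V_W, I_W)

# Residual of (92): 2 V^W - (N^W S^W - X^W I^W) should vanish.
residual = [2 * v - (ns - xi)
            for v, ns, xi in zip(V_W, matvec(N_W, S_W), matvec(X_W, I_W))]
```

In a full computation the same substitution is applied blockwise, so that only the amplitudes $I^S$ and $I^W$ remain as unknowns.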


F.  Modal  Decomposition 

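By employing the first completeness relation for the modes,
we can decompose the transverse part of the electric field into
a sum over modes as

$$\mathbf{E}_\perp(\mathbf{x}) = \sum_n \eta_n\, \mathbf{u}_n(\mathbf{x}) \tag{107}$$

where

$$\eta_n = \int_W ds\, \mathbf{u}_n(\mathbf{x}) \cdot \mathbf{E}_\perp(\mathbf{x}) \tag{108}$$

is the amplitude of the nth mode contained in $\mathbf{E}_\perp(\mathbf{x})$. It is
useful to further decompose $\mathbf{E}_\perp(\mathbf{x})$ into its incoming and
outgoing components

$$\mathbf{E}_\perp(\mathbf{x}) = \mathbf{E}_\perp^{\rm in}(\mathbf{x}) + \mathbf{E}_\perp^{\rm out}(\mathbf{x}). \tag{109}$$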

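Since the discretization of $\mathbf{E}_\perp^{\rm out}(\mathbf{x})$ is given by $V^W$, we may
write the discretized form of $\eta_n^{\rm out}$ as

$$\eta_n^{\rm out} = \sum_m B_{nm} V_m^W. \tag{110}$$

Using (85) to eliminate $\mathbf{E}_\perp(\mathbf{x})$, we arrive at the discretized
form of $\eta_n^{\rm in}$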

C  =  (HD 

771 

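Similarly, by employing the second completeness relation
for the modes, we may decompose the transverse part of the
magnetic field as

$$\mathbf{H}_\perp(\mathbf{x}) = \sum_n \tilde{\eta}_n\, (\hat{\mathbf{z}} \times \mathbf{u}_n(\mathbf{x})) \tag{112}$$

where

$$\tilde{\eta}_n = \int_W ds\, (\hat{\mathbf{z}} \times \mathbf{u}_n(\mathbf{x})) \cdot \mathbf{H}_\perp(\mathbf{x}). \tag{113}$$

Then, using

$$\mathbf{H}_\perp(\mathbf{x}) = \mathbf{H}_\perp^{\rm in}(\mathbf{x}) + \mathbf{H}_\perp^{\rm out}(\mathbf{x}) \tag{114}$$

and (89), we can write $\tilde{\eta}_n^{\rm out}$ and $\tilde{\eta}_n^{\rm in}$ in discretized form as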

=  (H5) 

771 

and 


G.  Power 

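The time-averaged power-flow-density vector (Poynting
vector) is [6]

$$\langle \mathbf{S}(\mathbf{x}) \rangle = \tfrac{1}{2}\, \mathrm{Re}[\mathbf{E}(\mathbf{x}) \times \mathbf{H}^*(\mathbf{x})]. \tag{117}$$

The total power flowing across the waveguide aperture in the
z direction is made up of an incoming part associated with the
incoming parts of $\mathbf{E}_\perp$ and $\mathbf{H}_\perp$ and an outgoing part associated
with the outgoing parts of $\mathbf{E}_\perp$ and $\mathbf{H}_\perp$. The total power exiting
(entering) the waveguide aperture is given by

$$P^q = \int_W ds\, \langle \mathbf{S}^q(\mathbf{x}) \rangle \cdot \hat{\mathbf{z}} = \frac{1}{2} \int_W ds\, \mathrm{Re}[\mathbf{E}_\perp^q(\mathbf{x}) \times \mathbf{H}_\perp^{q*}(\mathbf{x})] \cdot \hat{\mathbf{z}} \tag{118}$$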


for q = out (in). This integral is most conveniently evaluated
by decomposing $\mathbf{E}_\perp^q$ and $\mathbf{H}_\perp^q$ into their modal components,
since the modes are orthogonal and the power in the sum over
modes is equal to the sum of the powers in each mode.

The amplitude of the nth outgoing (incoming) mode contained
in $\mathbf{E}_\perp(\mathbf{x})$ is $\eta_n^{\rm out}$ ($\eta_n^{\rm in}$). Therefore, the time-averaged
power exiting (entering) the waveguide aperture is

$$P^q = \sum_{n=1}^{n_{\max}} \frac{|\eta_n^q|^2}{2 Z_n} \tag{119}$$

for q = out (in), where $n_{\max}$ is the largest value of n for which
$\beta_n$ is real. We exclude modes with imaginary propagation
constants since such modes do not transport any power into
or out of the guide on average.
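A short sketch of the modal power sum may help make the truncation at $n_{\max}$ concrete. It assumes the usual uniform-guide relation $\beta_n^2 = k^2 - k_{c,n}^2$ (with $k_{c,n}$ a cutoff wavenumber, a symbol not used in the text above) to decide which modes propagate, and it sums $|\eta_n|^2 / (2 Z_n)$ over those modes only. The function name and all numerical values are hypothetical.

```python
def propagating_power(k, cutoffs, amplitudes, impedances):
    """Sum |eta_n|^2 / (2 Z_n) over modes whose beta_n is real.

    Assumes beta_n^2 = k^2 - kc_n^2, so a mode propagates when kc_n < k.
    """
    total = 0.0
    for kc, eta, z in zip(cutoffs, amplitudes, impedances):
        if kc >= k:
            continue  # imaginary (or zero) beta_n: carries no average power
        total += abs(eta) ** 2 / (2.0 * z)
    return total


# Hypothetical data: two propagating modes and one evanescent mode.
k = 2.0
cutoffs = [1.0, 1.5, 3.0]
amplitudes = [1.0 + 1.0j, 0.5, 10.0]
impedances = [400.0, 500.0, 1.0]

power = propagating_power(k, cutoffs, amplitudes, impedances)
```

The third mode has a large amplitude but lies below cutoff, so it does not contribute to the sum.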

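The amplitude of the nth outgoing (incoming) mode contained
in $\mathbf{H}_\perp$ is $\tilde{\eta}_n^{\rm out}$ ($\tilde{\eta}_n^{\rm in}$). Therefore, the time-averaged power
exiting (entering) the waveguide aperture is

$$P^q = \sum_{n=1}^{n_{\max}} \frac{1}{2} Z_n |\tilde{\eta}_n^q|^2 \tag{120}$$

for q = out (in).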


IV.  Extensions 

Up  to  this  point,  we  have  assumed  that  all  energy  coupled 
into  incoming  traveling  modes  is  completely  absorbed.  It  is 
possible  (at  the  cost  of  some  extra  complication)  to  relax  this 
assumption,  as  we  now  demonstrate  for  scalar  scattering. 

Suppose a uniform waveguide is terminated after length L
by a wall (oriented perpendicular to the axis of the guide)
whose reflectivity for the mth waveguide mode is $r_m$. For the
time being, assume no independent sources are located inside
the guide. Every mode that enters with amplitude $b_n$ exits with
amplitude $a_n = r_n e^{i\beta_n 2L} b_n$, i.e., if $\psi^{\rm in}(\mathbf{x}) = \sum_n b_n u_n(\mathbf{x})$
comes in, then $\psi^{\rm out}(\mathbf{x}) = \sum_n r_n e^{i\beta_n 2L} b_n u_n(\mathbf{x})$ goes out.
This expression for $\psi^{\rm out}(\mathbf{x})$ can be rewritten as

$$\psi^{\rm out}(\mathbf{x}) = \int_W ds'\, R(\mathbf{x},\mathbf{x}')\, \psi^{\rm in}(\mathbf{x}') \tag{121}$$

where

$$R(\mathbf{x},\mathbf{x}') = \sum_n r_n e^{i\beta_n 2L}\, u_n(\mathbf{x})\, u_n(\mathbf{x}'). \tag{122}$$

After discretization, (121) becomes

$$N^W S^{\rm out} = (A N^W)^T R\, (A N^W)\, S^{\rm in} \tag{123}$$

where R is a diagonal reflectivity matrix whose elements are

$$R_{ij} = r_i e^{i\beta_i 2L} \delta_{ij}. \tag{124}$$
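As a small illustration of (124), the sketch below builds the diagonal of the reflectivity matrix for a hypothetical short-circuited guide ($r_n = -1$) and applies it to unit incoming amplitudes. The mode data are made up, and the evanescent mode is assumed to have $\beta_n$ on the positive imaginary axis so that its round-trip factor decays.

```python
import cmath
import math


def reflectivity_diagonal(r, beta, length):
    """Diagonal entries R_ii = r_i * exp(i * beta_i * 2L), as in (124)."""
    return [ri * cmath.exp(2j * bi * length) for ri, bi in zip(r, beta)]


# Hypothetical termination: perfectly reflecting wall at depth L.
r = [-1.0, -1.0]
beta = [1.2, 0.7j]   # one propagating mode, one evanescent mode
L = 0.5

R_diag = reflectivity_diagonal(r, beta, L)

# Outgoing amplitudes a_n = R_nn * b_n for unit incoming amplitudes b_n.
b = [1.0, 1.0]
a = [Rn * bn for Rn, bn in zip(R_diag, b)]
```

The propagating mode returns with unit magnitude and a phase shift of $2\beta L$ plus the wall phase, while the evanescent mode returns attenuated by $e^{-2|\beta| L}$.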

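A boundary condition relating $\psi$ and $\sigma$ on W can be obtained
by applying the operator $\int_W ds'\, (\delta(\mathbf{x} - \mathbf{x}') + R(\mathbf{x},\mathbf{x}'))\,\cdot$ to both
sides of (12) and using (52). The result is

$$\int_W ds'\, (\delta(\mathbf{x} - \mathbf{x}') + R(\mathbf{x},\mathbf{x}')) \int_W ds''\, H(\mathbf{x}',\mathbf{x}'')\, \sigma(\mathbf{x}'') = \int_W ds'\, (\delta(\mathbf{x} - \mathbf{x}') - R(\mathbf{x},\mathbf{x}'))\, \psi(\mathbf{x}') \tag{125}$$

or in discretized form

$$(N^W + R)(N^W)^{-1} X^W I^W = (N^W - R)\, S^W. \tag{126}$$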

The discretized relation takes a particularly simple and
appealing form if: 1) the basis functions used on W are
orthonormal, in which case $N^W = 1$, and 2) as many modes
are computed as there are basis functions on W, in which case
$A^T A = 1$. Then (126) is equivalent to

$$T X^W I^W = S^W \tag{127}$$

where

$$T_{ij} = [A^T t A]_{ij} \tag{128}$$

and

$$t_{mn} = \frac{1 + r_m e^{i\beta_m 2L}}{1 - r_m e^{i\beta_m 2L}}\, \delta_{mn} \tag{129}$$

is the diagonal transmission matrix giving the amplitude
transmission of each mode at the waveguide aperture.

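It is easy to modify these relations to allow for a specified
outgoing wave. Suppose the field $\psi^{\rm spec}(\mathbf{x})$ is specified as being
emitted from the aperture in addition to the reflected wave, i.e.,
$\psi^{\rm out}(\mathbf{x}) = \psi^{\rm spec}(\mathbf{x}) + \psi^{\rm refl}(\mathbf{x})$. We use $\psi^{\rm refl}(\mathbf{x})$ here to refer
to the quantity on the left side of (121). The result is

$$\int_W ds'\, (\delta(\mathbf{x} - \mathbf{x}') + R(\mathbf{x},\mathbf{x}')) \int_W ds''\, H(\mathbf{x}',\mathbf{x}'')\, \sigma(\mathbf{x}'') = \int_W ds'\, (\delta(\mathbf{x} - \mathbf{x}') - R(\mathbf{x},\mathbf{x}'))\, \psi(\mathbf{x}') - 2\psi^{\rm spec}(\mathbf{x}). \tag{130}$$

Its discretized form

$$2V^{\rm spec} = (N^W - R)\, S^W - (N^W + R)(N^W)^{-1} X^W I^W \tag{131}$$

is the obvious analog to (36) and reduces to it for R = 0.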

Even  more  generally,  one  can  imagine  the  situation  in 
which  each  incoming  mode  can  be  scattered  into  one  or  more 
outgoing  modes.  Any  number  of  practical  effects  (such  as 
nonuniformities  in  the  cross  section  or  imperfect  termination) 
could  cause  this  to  happen.  In  such  a  case,  the  reflectivity 
matrix R contains the amplitude for every mode to scatter
into every other mode and is no longer diagonal.

Analogous results obtain for the alternative form of the
scalar  waveguide  boundary  condition  and  for  the  vector  cases. 


V.  Summary 

As the previous discussion illustrates, the equations that
describe scattering interactions with waveguides can be put
into simple forms that are common to scalar scattering and
vector scattering. For example, the boundary condition on a
waveguide aperture may be written in both cases as

$$2V^W = N^W S^W - X^W I^W \tag{132}$$

or

$$2\tilde{V}^W = N^W I^W - \tilde{X}^W S^W. \tag{133}$$

In the scalar case, the unknown amplitudes $I^W$ and $S^W$
are related to the field $\psi$ and its longitudinal derivative $\sigma$
according to (33) and (34); the matrices $N^W$, $X^W$, and $\tilde{X}^W$
and the vectors $V^W$ and $\tilde{V}^W$ are given by (37) and (40).
In the vector case, the unknown amplitudes $I^W$ and $S^W$
are related to the equivalent electric and magnetic currents
$\mathbf{J}$ and $\mathbf{M}$ according to (90) and (91); the matrices $N^W$, $X^W$,
and $\tilde{X}^W$ and the vectors $V^W$ and $\tilde{V}^W$ are given by (93)
and (98). The discretized equations for scalar scattering when
W obeys the waveguide boundary condition and S obeys
Dirichlet boundary conditions [see (43)] are also identical
to the equations for vector scattering when W obeys the
waveguide boundary condition and S is perfectly conducting
[see (106)]. The commonality extends to the expressions for
power transport into and out of the waveguide as well.

Appendix 

Construction of the $X$ and $\tilde{X}$ matrices that appear in the
discretized expressions for the waveguide boundary condition
requires  an  approximate  representation  of  the  eigenmodes  in 
terms  of  basis  functions  on  patches  covering  the  waveguide 
aperture  as  well  as  the  eigenvalues  associated  with  these  eigen¬ 
modes.  For  a  few  geometries  such  as  rectangular  waveguide 
and  coaxial  waveguide,  complete  analytical  solutions  for  the 
eigenmodes  are  known.  In  such  cases,  it  is  a  simple  matter 
to  calculate  the  projection  of  a  given  eigenmode  onto  the  set 
of  basis  functions.  In  the  general  case,  an  eigenvalue  equation 
must  be  constructed  for  computing  the  modes. 

In  this  Appendix  we  describe  a  means  for  computing  the 
modes  of  cylindrical  waveguides  of  arbitrary  cross  section. 
There  are  three  subsections.  The  first  and  second  subsections 
describe  methods  for  numerically  solving  the  scalar  Helmholtz 
equation  for  the  waveguide  modes  when  the  waveguide  walls 
obey  either  Dirichlet  or  Neumann  boundary  conditions,  respec¬ 
tively.  The  third  subsection  describes  a  method  for  numerically 
solving  the  scalar  Laplace  equation  for  the  electrostatic  poten¬ 
tial  of  a  multiply-connected  cylindrical  waveguide,  all  but  one 
of  whose  surfaces  is  held  at  zero  potential. 

The  Helmholtz  modes  are  directly  applicable  to  scalar  prob¬ 
lems  such  as  acoustic  radiation  and  scattering.  The  Helmholtz 
and  Laplace  modes  are  applicable  to  electromagnetic  radiation 
and  scattering  problems  in  that  the  TM  and  TE  modes  can 
be  deduced  from  the  scalar  Helmholtz  modes  with  Dirichlet 
and Neumann boundary conditions, respectively, and the TEM
modes are derivable from the scalar Laplace modes. The
correspondence is described further in Section III-B of the
main text.

We  will  assume  the  availability  of  scalar  basis  functions  that 
are  continuous  across  patch  boundaries.  A  simple  example  of 
such  a  basis  function  is  a  function  that  spans  two  triangular 
patches  sharing  a  common  edge  and  whose  value  goes  linearly 
from  unity  on  the  common  edge  to  zero  at  the  opposing 
vertices.  The  extension  of  continuous  scalar  basis  functions 
to  higher  order  polynomials  in  the  surface  parameterization 
results  in  three  types  of  basis  functions  that  may  be  classified 
according  to  whether  they  span  two  patches  that  share  a 
common  edge,  span  multiple  patches  that  share  a  common 
vertex, or have single patch support. Basis functions of the first
variety go to zero at the opposing vertices and are nonzero on
the common edge; basis functions of the second variety go to
zero on all edges not touching the central vertex (where they
are nonzero); basis functions of the third variety are zero on
the boundary of a patch and nonzero in its interior.

A.  Scalar  Helmholtz  Modes 

1) Dirichlet Boundary Conditions on $\partial W$: Operating on
both sides of (5) by $\int_W d\mathbf{x}_\perp\, f_m(\mathbf{x}_\perp)$ turns it into an integral
equation, which may be written as


2) Neumann Boundary Conditions on $\partial W$: The Neumann
boundary condition demands that $(\mathbf{e}_\perp \cdot \nabla_\perp u_n(\mathbf{x}_\perp \in \partial W)) = 0$.
If we had basis functions whose values were nonzero on
the boundary but whose edge derivatives vanished on the
boundary, we could construct the modes directly from them,
just as we did in the Dirichlet case. Since we do not, we need
to augment our usual set of basis functions on the interior of
W with extra basis functions associated with the boundary of
W. Edge-based basis functions supported on the patch pairs
(one each from S and W) that share a common edge on $\partial W$
comprise this set.

The generalized eigenvalue equation again derives from
(135) and (136). In this case, however, the unknown coefficients
$A_{nm}$ also need to obey the added constraint that the edge
derivative of each eigenmode must vanish on the boundary. We
may write this constraint in integral form as


which, after substituting the discretized approximation for $u_n$,
becomes

$$\sum_m C_m A_{nm} = 0 \tag{141}$$


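$$-\int_W d\mathbf{x}_\perp\, f_m(\mathbf{x}_\perp)\, (\nabla_\perp \cdot \nabla_\perp u_n(\mathbf{x}_\perp)) = (k^2 - \beta_n^2) \int_W d\mathbf{x}_\perp\, f_m(\mathbf{x}_\perp)\, u_n(\mathbf{x}_\perp). \tag{134}$$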


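Integrating the left-hand side by parts and applying Gauss'
theorem to convert one of the resulting surface integrals into
a boundary integral, we get

$$\int_W d\mathbf{x}_\perp\, \nabla_\perp f_m(\mathbf{x}_\perp) \cdot \nabla_\perp u_n(\mathbf{x}_\perp) - \int_{\partial W} dl\, f_m(\mathbf{x}_\perp)\, (\mathbf{e}_\perp(\mathbf{x}_\perp) \cdot \nabla_\perp u_n(\mathbf{x}_\perp)) = (k^2 - \beta_n^2) \int_W d\mathbf{x}_\perp\, f_m(\mathbf{x}_\perp)\, u_n(\mathbf{x}_\perp) \tag{135}$$

where $\mathbf{e}_\perp(\mathbf{x}_\perp)$ is the unit edge normal to $\partial W$ at $\mathbf{x}_\perp$. The
unit edge normal is in the plane of W and points into the
waveguide wall.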

The Dirichlet boundary condition demands that $u_n(\mathbf{x}_\perp \in \partial W) = 0$.
If we expand the modes $u_n$ in a set of basis
functions $f_m$ that are continuous and vanish on the boundary
of W, i.e.,

$$u_n(\mathbf{x}_\perp) = \sum_m A_{nm}\, f_m(\mathbf{x}_\perp) \tag{136}$$

then the boundary integral term vanishes and (135) becomes
a generalized eigenvalue equation for the mode coefficients

$$\sum_{m'} M_{mm'} A_{nm'} = (k^2 - \beta_n^2) \sum_{m'} N_{mm'} A_{nm'} \tag{137}$$

where

$$N_{mm'} = \int_W d\mathbf{x}_\perp\, f_m(\mathbf{x}_\perp)\, f_{m'}(\mathbf{x}_\perp) \tag{138}$$

$$M_{mm'} = \int_W d\mathbf{x}_\perp\, \nabla_\perp f_m(\mathbf{x}_\perp) \cdot \nabla_\perp f_{m'}(\mathbf{x}_\perp). \tag{139}$$


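where

$$C_m = \int_{\partial W} dl\, \mathbf{e}_\perp(\mathbf{x}_\perp) \cdot \nabla_\perp f_m(\mathbf{x}_\perp). \tag{142}$$

Thus, we seek solutions to the eigenvalue equation

$$\sum_{m'} (M_{mm'} - L_{mm'})\, A_{nm'} = (k^2 - \beta_n^2) \sum_{m'} N_{mm'} A_{nm'} \tag{143}$$

where

$$L_{mm'} = \int_{\partial W} dl\, f_m(\mathbf{x}_\perp)\, (\mathbf{e}_\perp(\mathbf{x}_\perp) \cdot \nabla_\perp f_{m'}(\mathbf{x}_\perp)) \tag{144}$$

and the matrices M and N are defined as in the Dirichlet case,
subject to the constraint given by (141).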

We  can  subsume  the  constraint  information  directly  into 
the  eigenvalue  equation  by  use  of  the  projection  operator  P 
defined  by 

$$P = 1 - C^T (C C^T)^{-1} C \tag{145}$$

where  C  is  given  above  and  1  represents  the  identity  matrix  of 
the  proper  dimensionality.  P  has  the  property  that  it  reproduces 
vectors  x  that  obey  Cx  =  0  and  it  annihilates  vectors  that 
do  not.  P  also  has  the  property  that  the  vectors  x  that 
simultaneously obey the eigenvalue equation $Qx = \lambda x$ and
the constraint equation $Cx = 0$ are the same vectors that
obey the eigenvalue equation

$$PQPx = \lambda x. \tag{146}$$
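The behavior of the projection operator in (145) can be checked on a small example. The sketch below treats C as a single constraint row, as (141) suggests, so that $C C^T$ is a scalar; the numerical entries of C are made up. Strictly speaking, P removes the component of a vector along $C^T$: a vector satisfying $Cx = 0$ is reproduced, and a vector parallel to $C^T$ is annihilated.

```python
def projector(c):
    """Build P = 1 - C^T (C C^T)^{-1} C for a single constraint row C."""
    cc = sum(ci * ci for ci in c)  # C C^T is a scalar for one row
    n = len(c)
    return [[(1.0 if i == j else 0.0) - c[i] * c[j] / cc
             for j in range(n)] for i in range(n)]


def matvec(a, v):
    """Matrix-vector product for nested lists."""
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]


# Hypothetical constraint row with three basis functions.
C = [1.0, 2.0, -1.0]
P = projector(C)

x_ok = [1.0, 0.0, 1.0]    # satisfies C x = 0
x_bad = list(C)           # parallel to C^T, violates the constraint

kept = matvec(P, x_ok)    # expected to reproduce x_ok
killed = matvec(P, x_bad) # expected to vanish to rounding error
```

Because P is an orthogonal projector, applying it twice gives the same result as applying it once, which is why $PQP$ can be used in place of $Q$ without changing the constrained eigenvectors.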

Applying this projection to (143), we obtain the following generalized
eigenvalue equation for Neumann boundary conditions:

$$\sum_{m'} [P N^{-1} (M - L) P]_{mm'}\, A_{nm'} = (k^2 - \beta_n^2)\, A_{nm}. \tag{147}$$

Rows  of  A  (i.e.,  eigenvectors)  corresponding  to  eigenmodes 
that  do  not  obey  the  constraint  will  vanish  (to  numerical 
precision)  when  left  multiplied  by  P.  All  such  eigenmodes 
and eigenvectors should be discarded.

B. Scalar Laplace Modes

We seek solutions $u_n(\mathbf{x}_\perp)$ that obey the Laplace equation

$$\nabla_\perp^2 u_n(\mathbf{x}_\perp) = 0 \tag{148}$$

inside W and vanish on all boundaries of W except one
(call it $\partial W_n$), where we may arbitrarily set it to unity. Since
our basis functions vanish on the boundary, we need to
construct a special function $v_n(\mathbf{x}_\perp)$ that is continuous and
evaluates to unity on $\partial W_n$. For example, given triangular
patches parameterized by the three (nonindependent) triangle
coordinates $u_1$, $u_2$, and $u_3$, we could take $v_n = 0$ on all
patches that are not in contact with the boundary, $v_n = u_i$
on all patches that have the vertex $u_i = 1$ on the boundary,
and $v_n = 1 - u_i$ on all patches that have edge $u_i = 0$ on the
boundary. Then we want to approximately solve

$$\nabla_\perp^2 \left[ v_n(\mathbf{x}_\perp) - \sum_m A_{nm}\, f_m(\mathbf{x}_\perp) \right] = 0. \tag{149}$$

Applying the operator $\int_W d\mathbf{x}_\perp\, f_m(\mathbf{x}_\perp)$ to both sides and integrating
the resulting equation by parts produces the following
linear equation for the basis function coefficients $A_{nm}$ for the
potential function associated with the nth boundary:

$$\sum_{m'} M_{mm'} A_{nm'} = \int_W d\mathbf{x}_\perp\, \nabla_\perp f_m(\mathbf{x}_\perp) \cdot \nabla_\perp v_n(\mathbf{x}_\perp) \tag{150}$$

where M is as defined in (139).

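To make normalized TEM modes out of these Laplace
modes, we need them to obey

$$\begin{aligned}
1 &= \int_W d\mathbf{x}_\perp\, \mathbf{u}_n(\mathbf{x}_\perp) \cdot \mathbf{u}_n(\mathbf{x}_\perp) \\
&= \int_W d\mathbf{x}_\perp\, \nabla_\perp u_n(\mathbf{x}_\perp) \cdot \nabla_\perp u_n(\mathbf{x}_\perp) \\
&= \int_W d\mathbf{x}_\perp\, \nabla_\perp \cdot (u_n(\mathbf{x}_\perp) \nabla_\perp u_n(\mathbf{x}_\perp)) - \int_W d\mathbf{x}_\perp\, u_n(\mathbf{x}_\perp) \nabla_\perp^2 u_n(\mathbf{x}_\perp) \\
&= \int_{\partial W} dl\, u_n(\mathbf{x}_\perp)\, (\mathbf{e}_\perp(\mathbf{x}_\perp) \cdot \nabla_\perp u_n(\mathbf{x}_\perp)) \\
&= \int_{\partial W_n} dl\, (\mathbf{e}_\perp(\mathbf{x}_\perp) \cdot \nabla_\perp u_n(\mathbf{x}_\perp))
\end{aligned} \tag{151}$$

which means the coefficients of the discretized representation
of $u_n$ must be scaled to make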

1  ~  L\\- 

=  (152) 

m'  JdW„

Abstract: We demonstrate that a method of moments scattering
code employing high-order methods can compute accurate
values for the scattering cross section of a smooth body more
efficiently than a scattering code employing standard low-order
methods. Use of a high-order code also makes it practical to
provide meaningful accuracy estimates for computed solutions.

Index  Terms —  Boundary  integral  equation,  electromagnetic 
scattering,  high-order  numerical  method,  method  of  moments. 


I.  Introduction 

A  common  misconception  about  method  of  moments  solu¬ 
tions  to  scattering  problems  is  that  they  cannot  produce 
results  accurate  to  more  than  a  few  decimal  places.  Such  a 
limitation  cannot  be  fundamental.  The  method  of  moments 
technique  results  from  discretizing  an  integral  formulation  of 
the  wave  equation,  which,  in  its  continuous  form,  is  exact.  We 
expect  that  the  solution  to  the  discretized  integral  equation  will 
converge  to  the  solution  of  the  continuous  integral  equation  in 
the  limit  as  the  discretization  scale  size  is  reduced  to  zero,  if 
finite  precision  effects  are  negligible. 

The problem with achieving high accuracy is not a fundamental
one but rather a practical one, and it stems from
the almost universal use of low-order numerical methods in
scattering codes. Low-order numerical methods, while simpler
to implement, suffer from the fact that the computer resources
(e.g., memory and CPU time) required to achieve a given
solution  accuracy  grow  rapidly  as  the  accuracy  requirement 
increases.  Even  for  scatterers  only  a  few  wavelengths  in  size, 
the  computer  resources  required  to  compute  cross  sections 
to  more  than  a  few  digits  of  accuracy  may  be  excessive. 
High-order  methods  are  specifically  designed  to  overcome 
such  limitations  by  reducing  the  incremental  cost  of  accuracy 
improvements. 

FastScat™ is a general purpose, method of moments scattering
code [1] developed at Hughes Research Laboratories (now
HRL Laboratories) that employs high-order methods in its
current basis functions, quadratures, and geometry description.
The focus of this paper is on the current basis functions and
how they influence the convergence rate of computed cross
sections for two dimensional (2-D) scattering problems. We
will demonstrate that high-order methods make it practical to
achieve solution accuracies limited only by machine precision.
Such  a  demonstration  is  not  merely  of  academic  interest. 
High  accuracies  at  intermediate  stages  of  the  calculation  are 
sometimes  required  to  achieve  even  engineering  accuracies  in 
the  final  result.  Furthermore,  the  ability  to  obtain  accuracy 
improvements  at  relatively  low  cost  has  the  added  benefit 
that  it  becomes  possible  to  obtain  meaningful  estimates  of  the 
accuracy  of  a  computed  solution  [2].  Without  some  estimate 
of  its  accuracy,  a  computed  solution  is  of  limited  usefulness. 

II.  Scalar  Integral  Equations 

The  electromagnetic  scattering  problem  for  a  three- 
dimensional  (3-D)  scatterer  that  is  translationally  invariant  in 
one  direction  can  be  decoupled  into  two  independent  problems, 
each  of  which  is  isomorphic  to  a  two  dimensional  scalar 
scattering  problem  with  a  different  boundary  condition.  In  the 
TM case, the incident electric field is polarized parallel to the
axis of symmetry; in the TE case, it is the incident magnetic
field.  The  boundary  conditions  for  the  2-D  scalar  scattering 
problem  corresponding  to  a  perfect  electrical  conductor  (PEC) 
in  3-D  are  Dirichlet  for  TM  polarization  and  Neumann  for 
TE  polarization. 

For the TM polarization case [$\psi(\mathbf{x}'$ on $C) = 0$], the electric
field integral equation for PEC boundary conditions is

$$\psi^{\rm inc}(\mathbf{x}) = -\oint_C dl'\, G(\mathbf{x},\mathbf{x}')\, \sigma(\mathbf{x}') \tag{1}$$

where $\psi^{\rm inc}$ is the incident field and $\sigma$ is the surface charge
density. It is defined as the normal derivative of the total field
$\psi$ on the surface, i.e.,

$$\sigma(\mathbf{x}') = -\hat{\mathbf{n}} \cdot \nabla' \psi(\mathbf{x}') \tag{2}$$

where $\hat{\mathbf{n}}$ is the outward normal to the scattering surface at
$\mathbf{x}'$. The integral is taken around the contour C given by
the intersection of the 3-D scattering surface and a plane
perpendicular to the axis of symmetry. The kernel G is the
Green function of the Helmholtz wave equation in 2-D, namely

$$G(\mathbf{x},\mathbf{x}') = \frac{i}{4} H_0^{(1)}(k|\mathbf{x} - \mathbf{x}'|) \tag{3}$$

where $H_0^{(1)}$ is the zeroth-order Hankel function of the first
kind and k is the wavenumber of the incident field. Similarly,
for the TE polarization case [$\sigma(\mathbf{x}'$ on $C) = 0$], the electric
field integral equation is

$$-\hat{\mathbf{n}} \cdot \nabla \psi^{\rm inc}(\mathbf{x}) = (\hat{\mathbf{n}} \cdot \nabla) \oint_C dl'\, (\hat{\mathbf{n}}' \cdot \nabla' G(\mathbf{x},\mathbf{x}'))\, \psi(\mathbf{x}'). \tag{4}$$

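The correspondence between the scalar quantities $\psi$ and $\sigma$
and the parallel (to the surface) components of the electric
and magnetic fields is given by

$$\mathbf{E}_\parallel(\mathbf{x}) = \psi(\mathbf{x})\, \hat{\mathbf{z}} \tag{5}$$

$$\mathbf{H}_\parallel(\mathbf{x}) = \frac{\sigma(\mathbf{x})}{i\omega\mu}\, \hat{\mathbf{z}} \times \hat{\mathbf{n}} \tag{6}$$

in the TM case and

$$\mathbf{H}_\parallel(\mathbf{x}) = \psi(\mathbf{x})\, \hat{\mathbf{z}} \tag{7}$$

$$\mathbf{E}_\parallel(\mathbf{x}) = \frac{\sigma(\mathbf{x})}{i\omega\epsilon}\, \hat{\mathbf{n}} \times \hat{\mathbf{z}} \tag{8}$$

in the TE case, where $\hat{\mathbf{z}}$ is the direction of translational invariance,
$\hat{\mathbf{n}}$ is the surface normal, $\omega$ is the angular frequency, and $\epsilon$
and $\mu$ are the dielectric constant and magnetic susceptibility of
the external medium, respectively. All fields implicitly contain
the time dependence factor $e^{-i\omega t}$.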

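A Galerkin method of moments solution [3] to the continuous
scalar field equation, (1), proceeds by first expanding the
unknown charge $\sigma(\mathbf{x})$ in terms of basis functions $f_j(\mathbf{x})$,

$$\sigma(\mathbf{x}) = \sum_j J_j\, f_j(\mathbf{x}) \tag{9}$$

and then testing the equation with each of the basis functions
by applying the operator $\oint_C dl\, f_i(\mathbf{x})\,\cdot$ to both sides. The
result is a matrix equation of the form

$$V = ZJ \tag{10}$$

where

$$V_i = \oint_C dl\, \psi^{\rm inc}(\mathbf{x})\, f_i(\mathbf{x}) \tag{11}$$

and

$$Z_{ij} = -\oint_C dl \oint_C dl'\, f_i(\mathbf{x})\, G(\mathbf{x},\mathbf{x}')\, f_j(\mathbf{x}'). \tag{12}$$

Similarly, we can discretize the scalar charge equation, (4), by
expanding the unknown field as

$$\psi(\mathbf{x}) = \sum_j S_j\, f_j(\mathbf{x}) \tag{13}$$

and applying the testing operators to arrive at the matrix
equation

$$V = ZS \tag{14}$$

where

$$V_i = -\oint_C dl\, (\hat{\mathbf{n}} \cdot \nabla \psi^{\rm inc}(\mathbf{x}))\, f_i(\mathbf{x}) \tag{15}$$

and

$$Z_{ij} = \oint_C dl\, f_i(\mathbf{x})\, (\hat{\mathbf{n}} \cdot \nabla) \oint_C dl'\, (\hat{\mathbf{n}}' \cdot \nabla' G(\mathbf{x},\mathbf{x}'))\, f_j(\mathbf{x}') \tag{16a}$$

$$= \oint_C dl \oint_C dl'\, f_i(\mathbf{x}) \left( k^2 (\hat{\mathbf{n}} \cdot \hat{\mathbf{n}}') - \frac{\partial^2}{\partial l\, \partial l'} \right) G(\mathbf{x},\mathbf{x}')\, f_j(\mathbf{x}') \tag{16b}$$

$$= \oint_C dl \oint_C dl' \left( k^2 (\hat{\mathbf{n}} \cdot \hat{\mathbf{n}}')\, f_i(\mathbf{x})\, f_j(\mathbf{x}') - \frac{\partial f_i(\mathbf{x})}{\partial l} \frac{\partial f_j(\mathbf{x}')}{\partial l'} \right) G(\mathbf{x},\mathbf{x}'). \tag{16c}$$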


The second form for $Z_{ij}$ is like the first in that it requires
differentiating the kernel twice. In the first form they are
normal derivatives; in the second they have been converted to
tangential derivatives by use of the Helmholtz equation. Differentiating
the kernel exacerbates the singularity of the kernel
at $\mathbf{x} = \mathbf{x}'$, which is unattractive from a numerical standpoint
unless some smoothing operator is applied to the kernel before
differentiation. FastScat uses a high-order regulated kernel [4]
that is analytic everywhere to avoid this difficulty. The third
form is obtained from the second by twice integrating by
parts. This reduces the singularity of the kernel to that of the
Dirichlet case. It does, however, require basis functions that
are differentiable.


III.  High-Order  Methods 

FastScat  uses  patch-based  basis  functions  for  both  the  TM 
and  TE  polarization  cases.  That  is  to  say  the  basis  functions 
are  nonzero  only  on  individual  patches.  The  patches  are 
arbitrarily  curved  line  segments  parameterized  by  a  function 
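$\mathbf{x}(u)$, $0 \le u \le 1$. The basis functions are defined in terms of
the surface parameterization according to

$$f_n(\mathbf{x}(u)) = \sqrt{\frac{2n+1}{\sqrt{g(u)}}}\; P_n(2u - 1) \tag{17}$$

where $P_n$ is the nth Legendre polynomial and

$$g(u) = \frac{d\mathbf{x}}{du} \cdot \frac{d\mathbf{x}}{du} \tag{18}$$

is the metric for the patch [5]. The normalization factors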
are chosen to make the basis functions orthonormal when
integrated over a patch, i.e.,

$$\int_{\rm patch} dl\, f_m(\mathbf{x})\, f_n(\mathbf{x}) = \int_0^1 du\, \sqrt{g(u)}\, f_m(u)\, f_n(u) = \delta_{mn}. \tag{19}$$
The contribution to the overall solution error due to surface
misrepresentation can be eliminated by internally representing
the surface using its exact functional form [6]. Using the
combination of high-order basis functions and an exact surface
representation, FastScat can obtain a high-order approximation
to the smoothly varying source distribution that is to be
expected on a smooth scattering surface.
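The orthonormality condition (19) can be verified numerically for a simple patch. The sketch below is an illustration under stated assumptions rather than FastScat's implementation: it assumes a flat patch with constant metric $\sqrt{g}$ equal to the patch length and basis functions of the form $\sqrt{(2n+1)/\sqrt{g}}\, P_n(2u-1)$, which is one choice that satisfies (19). All function names are hypothetical.

```python
import math


def legendre_pair(n, x):
    """Return (P_n(x), P_{n-1}(x)) using the three-term recurrence."""
    if n == 0:
        return 1.0, 0.0
    p_prev, p = 1.0, x
    for k in range(2, n + 1):
        p_prev, p = p, ((2 * k - 1) * x * p - (k - 1) * p_prev) / k
    return p, p_prev


def gauss_legendre(n):
    """Nodes and weights of the n-point Gauss-Legendre rule on [-1, 1]."""
    nodes, weights = [], []
    for i in range(1, n + 1):
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))  # initial guess
        for _ in range(50):  # Newton iteration on P_n(x) = 0
            p, p_prev = legendre_pair(n, x)
            dp = n * (x * p - p_prev) / (x * x - 1.0)
            x -= p / dp
        p, p_prev = legendre_pair(n, x)
        dp = n * (x * p - p_prev) / (x * x - 1.0)
        nodes.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights


def basis(n, u, sqrt_g):
    """Assumed basis function on 0 <= u <= 1 for a constant metric."""
    return math.sqrt((2 * n + 1) / sqrt_g) * legendre_pair(n, 2.0 * u - 1.0)[0]


def gram(max_order, sqrt_g, n_quad=8):
    """Gram matrix of (19): integral of sqrt(g) f_m f_n over 0 <= u <= 1."""
    xs, ws = gauss_legendre(n_quad)
    pts = [(0.5 * (x + 1.0), 0.5 * w) for x, w in zip(xs, ws)]  # map to [0, 1]
    size = max_order + 1
    return [[sum(w * sqrt_g * basis(m, u, sqrt_g) * basis(n, u, sqrt_g)
                 for u, w in pts) for n in range(size)] for m in range(size)]


G = gram(3, sqrt_g=2.5)  # hypothetical flat patch of length 2.5
```

With an 8-point rule the quadrature is exact for these polynomial products, so G should equal the identity matrix to rounding error.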

By  contrast,  standard,  low-order  method  of  moments  im¬ 
plementations  use  flat  segments  to  approximate  the  surface 
geometry and basis functions that are constant (in the TM case)
or piecewise linear (in the TE case) on a patch to approximate
the sources. Representing a smoothly curved scatterer using
flat segments is an example of surface representation error.
Using flat segments further degrades the accuracy of the computation
by introducing artificial edges, which cause spurious
diffraction. Constant (or "pulse") basis functions are equivalent
to the zeroth-order basis functions in FastScat; piecewise
linear (or "rooftop") basis functions can be constructed from
FastScat's zeroth-order and first-order basis functions. The
advantage  of  having  higher  order  polynomial  basis  functions 
is  that  they  can  provide  accurate  approximations  to  smooth 
functions  more  efficiently  than  pulse  or  rooftop  basis  functions 
alone  can. 

The  third  numerical  method  that  must  be  high  order  to 
achieve  high-order  convergence  in  the  final  result  involves  nu¬ 
merical  evaluation  of  integrals  such  as  those  in  (12)  and  (16). 
Gaussian  quadrature  is  a  well-known  high-order  method  for 
evaluating  integrals  of  nonsingular  integrands.  The  impedance 
matrix  elements  of  (12)  and  (16)  fall  into  this  category  when 
the  regions  of  integration  of  x  and  x'  do  not  intersect. 
Such  integrals  may  be  evaluated  efficiently  with  Gaussian 
quadrature  and  typically  are,  even  in  standard  method  of 
moments  codes.  The  trouble  begins  when  the  regions  of 
integration  do  intersect,  as  occurs  when  the  patches  involved 
touch  or  are  the  same.  In  such  cases,  standard  Gaussian 
quadrature  is  reduced  to  the  status  of  a  low-order  method 
[7], [8]. So-called "singularity removal" (which is misnamed
because, although it removes the infinity in the kernel at
$\mathbf{x} = \mathbf{x}'$, it does not eliminate the singularity of the kernel at
$\mathbf{x} = \mathbf{x}'$ in the strict mathematical sense) is often called upon to
handle such integrals, even though it does not actually restore
the high-order behavior of Gaussian quadrature.
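The loss of high-order behavior for singular integrands is easy to observe. The sketch below applies standard Gauss-Legendre quadrature to a smooth integrand and to a logarithmically singular one on [0, 1]; both test integrals and the helper names are illustrative choices, not taken from the paper. The nodes are generated with a textbook Newton iteration.

```python
import math


def legendre_pair(n, x):
    """Return (P_n(x), P_{n-1}(x)) using the three-term recurrence."""
    if n == 0:
        return 1.0, 0.0
    p_prev, p = 1.0, x
    for k in range(2, n + 1):
        p_prev, p = p, ((2 * k - 1) * x * p - (k - 1) * p_prev) / k
    return p, p_prev


def gauss_legendre(n):
    """Nodes and weights of the n-point Gauss-Legendre rule on [-1, 1]."""
    nodes, weights = [], []
    for i in range(1, n + 1):
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))  # initial guess
        for _ in range(50):  # Newton iteration on P_n(x) = 0
            p, p_prev = legendre_pair(n, x)
            dp = n * (x * p - p_prev) / (x * x - 1.0)
            x -= p / dp
        p, p_prev = legendre_pair(n, x)
        dp = n * (x * p - p_prev) / (x * x - 1.0)
        nodes.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights


def integrate_unit(f, n):
    """Approximate the integral of f over [0, 1] with an n-point rule."""
    xs, ws = gauss_legendre(n)
    return sum(0.5 * w * f(0.5 * (x + 1.0)) for x, w in zip(xs, ws))


def err_smooth(n):
    """Error for the smooth integrand exp(x); exact value is e - 1."""
    return abs(integrate_unit(math.exp, n) - (math.e - 1.0))


def err_log(n):
    """Error for the singular integrand ln(x); exact value is -1."""
    return abs(integrate_unit(math.log, n) + 1.0)


errors = {n: (err_smooth(n), err_log(n)) for n in (2, 4, 8, 16)}
```

The smooth error collapses to rounding level within a few points, whereas the logarithmic error shrinks only algebraically as the number of points grows, which is the low-order behavior described above.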

Several  schemes  for  high-order  evaluation  of  singular  inte¬ 
grands  have  been  devised  for  and  implemented  in  FastScat. 
One  involves  using  quadrature  rules  that  are  specific  to  the 
singularity. For 2-D, where the singularity of the kernel is
logarithmic, high-order "lin-log" rules [9] have been developed.
They are designed to exactly integrate products of
polynomials and logarithms. An alternate approach that is
more easily extended to the 3-D scattering case involves
tampering with the kernel to eliminate the singularity at
$\mathbf{x} = \mathbf{x}'$, but doing it in such a way that convolutions of the kernel
with polynomial functions are still computed exactly [4]. The
resulting function is regular (i.e., analytic); hence the name
"regulated kernel". Convolutions of smooth functions with an
appropriate  regulated  kernel  may  be  evaluated  in  a  high-order 
fashion  by  means  of  standard  Gaussian  quadrature.  Both  of 
these  methods  lead  to  similar  results.  The  calculations  reported 
in  this  paper  were  performed  using  a  high-order  regulated 
kernel  and  Gaussian  quadrature. 

High-order  methods  have  the  potential  to  greatly  improve 
the efficiency of obtaining accurate numerical results. However,
like a chain whose strength is limited by its weakest
link, the convergence rate of an algorithm whose final result
depends on several numerical methods is limited by the
convergence rate of its lowest order method. For scattering
computations, this applies to the numerical methods used for
surface representation, basis functions, and quadratures. To
show how the method order of one of these components affects
the rate of convergence of the full solution, it is best to vary
that one while setting the method order for each of the other
two components high enough that they do not contribute any
noticeable error. With FastScat, the user can control the order
of  each  of  these  three  numerical  methods. 

The  focus  of  this  paper  is  on  high-order  basis  functions  and 
how  they  can  be  employed  to  efficiently  compute  accurate 
results.  Therefore,  the  calculations  summarized  here  show  the 
effect  of  varying  the  basis  function  order  while  using  exact 
surface  representations  and  quadrature  orders  high  enough  that 
numerical integration error was negligible. In normal usage,
one  generally  uses  exact  surfaces  and  sets  the  orders  of  the 
basis  functions  and  the  quadratures  to  be  no  higher  than 
necessary  to  achieve  the  desired  accuracy  in  the  final  result. 


IV.  Results 

Measuring  the  order  of  convergence  of  a  numerical  method 
requires  observing  how  the  error  in  the  final  result  responds  to 
changes in the discretization. For small enough discretization
scales $h$, we expect the error to scale as $\epsilon \sim h^n$ for an
$n$th-order numerical method.

In the next two subsections, we present results of FastScat
calculations on canonical 2-D geometries (a circle and an
ellipse) that demonstrate how the rate of convergence varies
with discretization scale size and basis function order. The
third subsection is devoted to a large 2-D scattering geometry
we call the "bat." The bat is prototypical of scatterers whose
cross section has a large dynamic range as a function of angle.
For such scatterers, the utility of a high-order scattering code
becomes evident even at "practical" accuracies. Sun SPARC
10s were used for the circle calculations; the ellipse and bat
calculations were performed on IBM RS/6000 computers.

A.  Circle 

The  circle  is  one  of  the  best  geometries  to  use  for  investigat¬ 
ing  the  convergence  properties  of  a  scattering  code  because  it 
has  no  geometrical  singularities  (e.g.,  edges  and  corners)  and 
the  answer  can  be  computed  to  arbitrary  accuracy  by  summing 
the  Mie  series.  This  means  that  we  can  determine  exactly  and 
unambiguously  what  the  errors  are  in  our  computed  solutions, 
which  eliminates  one  of  the  sources  of  disagreement  about 
how  to  quantify  solution  accuracy. 

We  used  FastScat  to  compute  the  bistatic  cross  section 
of 1λ-radius circles for Dirichlet and Neumann boundary
conditions,  corresponding  to  TM  and  TE  polarizations,  re¬ 
spectively.  The  circles  were  divided  into  equal  segments, 
each  segment  being  represented  internally  as  a  circular  arc. 
Quadrature  orders  were  set  high  enough  to  guarantee  that 
numerical  integrations  would  be  accurate  to  better  than  one 
part in $10^{12}$.

Fig. 2. Fractional difference between the cross section computed by FastScat
using pulse basis functions and the exact cross section (Fig. 1) as a function of
observation angle. The curves are labeled by the number of identical segments
into which the 1λ-radius circle was divided.

We  performed  a  series  of  calculations  with  different  basis 
function orders and different numbers of segments, and compared
against the exact results (Fig. 1). A sample of the results
is shown in Fig. 2 for the case of zeroth-order basis functions
and TM polarization. The error in the cross section varies as
a function of bistatic scattering angle. It is evident, however,
that, for 64 or more patches, increasing the number of patches
by a factor of four reduces the overall error by a factor of
about 64.
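The observed order implied by such a refinement pair follows directly from the scaling $\epsilon \sim h^n$. A minimal sketch (the helper name is hypothetical, not part of FastScat):

```python
import math

def observed_order(refinement_factor, error_reduction):
    """Order n implied by error ~ h**n when h shrinks by refinement_factor
    and the error shrinks by error_reduction."""
    return math.log(error_reduction) / math.log(refinement_factor)

if __name__ == "__main__":
    # Four times as many patches (h -> h/4) with 64 times less error.
    print(observed_order(4, 64))
```

For the figures quoted above (a factor of four in patch count, a factor of about 64 in error), the implied order is three.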

We  can  make  a  stronger  quantitative  statement  about  the 
discretization  error  if  we  condense  the  error  versus  angle 
information  into  a  single  number  for  each  discretization. 
Of the many ways to do this, we have investigated three:
maximum relative error, maximum error ÷ average cross
section, and root mean square (rms) error. For this particular
problem, the result is essentially independent of which measure
of error is chosen. Fig. 3 shows maximum relative error
($\max[|\mathrm{RCS}(\theta)/\mathrm{RCS}_{\mathrm{ref}}(\theta) - 1|]$) versus density of unknowns
plotted on a log-log scale for basis function orders zero, one,
and two, and numbers of patches ranging from four to 4096.
Consider the TM polarization case first. The most important
feature to note is that, for enough unknowns, the data fit a

Fig. 3. Log-log plot of maximum relative error versus density of unknowns
for the TM and TE polarization cases. Each set of points is labeled by basis
function order.

linear trend line whose slope increases as the basis function
order increases. Since the discretization scale $h$ is inversely
proportional to the number of unknowns $N$, this simply reflects
the fact that the error diminishes as $h^m$, where $m$ increases
with method order. In fact, the slopes of the lines connecting
constant basis function points are close to integers (three
for zeroth-order, five for first-order, and seven for second-order),
indicating that the order of convergence of the cross
section when using $n$th-order basis functions is $m \approx 2n + 3$.
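The three scalar error measures named above can be written down concretely. The sketch below is one possible reading, not the FastScat implementation: the function name and dictionary keys are hypothetical, and the rms measure is assumed to be the rms of the relative error, since the text does not state its normalization.

```python
import numpy as np

def error_measures(rcs, rcs_ref):
    """Condense error-versus-angle data into three scalar measures."""
    rcs = np.asarray(rcs, dtype=float)
    rcs_ref = np.asarray(rcs_ref, dtype=float)
    relative = rcs / rcs_ref - 1.0
    return {
        # max over angle of |RCS/RCS_ref - 1|
        "max_relative": np.max(np.abs(relative)),
        # largest absolute error divided by the average reference cross section
        "max_over_average": np.max(np.abs(rcs - rcs_ref)) / np.mean(rcs_ref),
        # assumption: rms taken over the relative error
        "rms_relative": np.sqrt(np.mean(relative ** 2)),
    }
```

The inputs are cross sections sampled at the same set of angles; for scatterers with a large dynamic range the first two measures can differ considerably, because the first weights every angle equally while the second is dominated by angles where the cross section is large.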

On  the  same  plot,  we  also  show  an  example  of  how  the 
surface  model  affects  the  convergence  rate.  The  dashed  curve 
connects  points  that  were  computed  by  replacing  the  circular 
arc  patches  with  flat  patches.  The  order  of  the  quadratures  was 
the  same  as  in  the  previous  case.  For  this  case,  however,  only 
one  basis  function  order  is  shown,  namely  zero.  The  reason 
is  that  the  poor  surface  representation  so  limits  the  rate  of 
convergence  that  increasing  the  order  of  the  basis  functions  has 
essentially  no  effect  on  the  accuracy  of  the  solution.  Curves  for 
higher basis function orders are virtual copies of the zeroth-order
result, shifted to higher numbers of unknowns. In all
such cases, the error in the cross section is consistent with
$h^2$ scaling.

In the TE case, the slopes of the lines connecting constant
basis function points are close to one for zeroth order, three
for first order, and five for second order, indicating that the
order of convergence of the cross section when using $n$th-order
basis functions is $m = 2n + 1$. The dashed curve
connects points computed according to the standard method

Fig. 4. Semilog plot of maximum relative error versus density of unknowns
for TM scattering from a 10λ-radius circle. Points corresponding to different
basis function orders for a fixed patch size are connected by lines and labeled
by the number of patches.


Fig. 5. Log-log plot of maximum relative error versus total computation
time required to calculate the bistatic cross section of a 10λ-radius circle
with TM polarization. Points corresponding to different basis function orders
and a fixed patching are connected by lines, which are labeled by the number
of equal arc length patches used.


of moments procedure for TE polarization, namely, by putting
rooftop basis functions on a faceted approximation to the
scatterer. It converges more rapidly than do the calculations
that used zeroth-order (i.e., pulse) basis functions with an
exact geometry representation. This is not surprising given that
currents modeled by rooftop basis functions are guaranteed to
be continuous across patch boundaries, whereas those modeled
by pulse basis functions are not. As in the TM case, however,
using higher order basis functions, whether patch-based or
edge-based, does not improve the order of convergence when
a low-order geometry representation is used. It only increases
the number of unknowns used to achieve a given accuracy.
In all such cases, the error in the cross section is consistent
with $h^2$ scaling.

Since memory usage is proportional to $N^2$, these plots
also show how method order affects the relationship between
accuracy and memory used. For errors less than about $10^{-4}$
in the TM case and one in the TE case, not only are the
errors in the cross sections lower when high-order methods are
employed, but also the marginal cost of additional accuracy is
lower.

In the plots shown so far, curves connect data points
corresponding to decreasing patch sizes at a constant method
order. In finite element terminology this is known as "h-refinement."
As we have seen, h-refinement on a smooth
scatterer results in geometric convergence in the cross section.
Alternatively, one can take the same data and make
a plot by connecting points of increasing method order for
a fixed patch size. This is known as "p-refinement." The
result of doing this for bistatic scattering from a 10λ-radius
circle and TM polarization is shown in Fig. 4. The curves
tend toward straight lines, which, on a semilog plot, indicates
exponential convergence. Exponential convergence in the
computed cross section is characteristic of p-refinement on a
smooth scatterer when high-order polynomial basis functions
are used.
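The contrast between the two refinement strategies can be seen in a much simpler setting. The sketch below is a one-dimensional quadrature analog, not a scattering calculation: it integrates the smooth function cos(4x) over [0, 1] with Gauss-Legendre rules, either by adding panels at fixed order (h-refinement) or by raising the order on a single panel (p-refinement). The integrand and all function names are assumptions chosen for illustration.

```python
import numpy as np

EXACT = np.sin(4.0) / 4.0  # integral of cos(4x) over [0, 1]

def composite_gauss(n_points, n_panels):
    """Composite Gauss-Legendre rule for cos(4x) on [0, 1]."""
    t, w = np.polynomial.legendre.leggauss(n_points)
    edges = np.linspace(0.0, 1.0, n_panels + 1)
    total = 0.0
    for a, b in zip(edges[:-1], edges[1:]):
        x = 0.5 * (b - a) * t + 0.5 * (a + b)   # map nodes to panel [a, b]
        total += 0.5 * (b - a) * np.dot(w, np.cos(4.0 * x))
    return total

def h_refinement_errors(panel_counts, n_points=2):
    """Errors for a fixed-order rule on successively more panels."""
    return [abs(composite_gauss(n_points, m) - EXACT) for m in panel_counts]

def p_refinement_errors(orders, n_panels=1):
    """Errors for successively higher-order rules on a fixed panel count."""
    return [abs(composite_gauss(p, n_panels) - EXACT) for p in orders]

if __name__ == "__main__":
    print("h-refinement:", h_refinement_errors([4, 8, 16, 32, 64]))
    print("p-refinement:", p_refinement_errors([2, 4, 6, 8, 10]))
```

In this analog, doubling the number of panels reduces the error by a roughly constant factor (a straight line on a log-log plot), whereas raising the order on one panel drives the error to round-off with far fewer function evaluations.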

Methods  that  achieve  high-order  convergence  in  general, 
and exponential convergence in particular, have obvious advantages
for efficiently computing accurate cross sections.
What may be less obvious is the fact that they facilitate
accuracy  estimation  for  computed  solutions.  For  example, 
suppose  we  had  not  had  an  independent  means  (such  as  the 
Mie  series  for  a  circle)  for  computing  a  suitably  accurate 
reference  solution.  We  could  still  obtain  an  estimate  of  the 
accuracy  of  a  given  computed  solution  by  comparing  it  to 
a  reference  solution  generated  by  redoing  the  computation 
with  an  even  finer  discretization.  To  be  useful,  however, 
the  reference  solution  must  be  significantly  more  accurate 
than  the  comparison  solution.  Obtaining  a  suitable  reference 
solution  using  low-order  methods  may  require  doubling  or 
quadrupling  the  number  of  patches,  and  hence  the  number 
of  unknowns.  The  additional  cost  of  such  a  calculation  may 
be  so  high  as  to  make  it  impractical.  On  the  other  hand,  gen¬ 
erating  the  reference  solution  by  increasing  the  basis  function 
order  can  produce  a  significantly  better  answer  with  only  a 
modest  increase  in  the  number  of  unknowns.  The  increase  in 
required  memory  and  computation  time  is  likewise  modest.  In 
our opinion, the widespread reliance on low-order methods
is  what  accounts  for  the  fact  that  it  is  virtually  unheard 
of  to  see  accuracy  estimates  accompanying  computed  cross 
sections. 

Another  observation  that  may  be  made  from  Fig.  4  is  that 
the  way  to  achieve  a  high  accuracy  result  using  the  least 
memory  (i.e.,  fewest  unknowns)  is  to  make  the  patches  large 
and  put  high-order  basis  functions  on  them.  A  look  at  run 
times  instead  of  unknowns/memory  usage  leads  to  the  same 
conclusion. Fig. 5 shows that for TM scattering from a 10λ-radius
circle, the total computation time required to achieve a
given accuracy decreases as the number of patches decreases.
A point of diminishing returns is reached at around 16 patches,
at which point the arc length of each patch is about 4λ. The
optimum  distribution  of  patch  sizes  for  an  arbitrary  scatterer 
will  depend  on  its  geometry.  The  general  rule  of  thumb 
that  we  follow  for  patching  smooth  scatterers  is  to  make 
the  patches  about  one  wavelength  long,  except  in  regions 
where  the  geometry  is  strongly  curved.  In  such  regions,  the 
patches  should  be  some  moderate  fraction  of  the  local  radius 
of  curvature. 


IEEE TRANSACTIONS ON ANTENNAS AND PROPAGATION, VOL. 47, NO. 4, APRIL 1999


Fig. 6. Monostatic cross section of a 20λ × 2λ ellipse (shown with 32
patches) for TM polarization.

B.  Ellipse 

A good candidate geometry on which to apply this rule of
thumb is the 20λ × 2λ ellipse. We can describe the ellipse by
the parametric equations

$$x = a \cos u \tag{20a}$$

$$y = b \sin u \tag{20b}$$

where $a = 10\lambda$ and $b = 1\lambda$. A sensible patching, which
puts the highest density of patches in the most highly curved
regions and vice versa for the flatter regions, is obtained if the
patches cover equal increments in the parameter $u$, as indicated
in the inset to Fig. 6.
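The effect of patching in equal increments of the parameter can be checked numerically. The following sketch assumes the parameterization (20a) and (20b) with lengths measured in wavelengths; the function name and the choice of a 16-point Gauss-Legendre rule for the arc-length integrals are illustrative assumptions.

```python
import numpy as np

def ellipse_patch_lengths(a=10.0, b=1.0, n_patches=32, n_quad=16):
    """Arc length (in wavelengths) of each patch when patches cover equal
    increments of the parameter u in x = a cos u, y = b sin u."""
    t, w = np.polynomial.legendre.leggauss(n_quad)
    edges = np.linspace(0.0, 2.0 * np.pi, n_patches + 1)
    lengths = np.empty(n_patches)
    for i in range(n_patches):
        u0, u1 = edges[i], edges[i + 1]
        u = 0.5 * (u1 - u0) * t + 0.5 * (u0 + u1)   # quadrature nodes in [u0, u1]
        speed = np.sqrt((a * np.sin(u)) ** 2 + (b * np.cos(u)) ** 2)  # |dx/du|
        lengths[i] = 0.5 * (u1 - u0) * np.dot(w, speed)
    return lengths

if __name__ == "__main__":
    lengths = ellipse_patch_lengths()
    print("shortest patch:", lengths.min(), " longest patch:", lengths.max())
    print("perimeter:", lengths.sum())
```

With 32 patches, those adjacent to the tips of the ellipse (near u = 0 and u = π) come out several times shorter than those on the flat sides (near u = π/2), which is the qualitative behavior the rule of thumb asks for.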

We  used  FastScat  to  compute  the  monostatic  cross  section 
in TM polarization of a 20λ × 2λ ellipse using several different
combinations of basis function order and number of patches. In
all cases, an exact surface representation was used to eliminate
surface representation error, and the quadrature order was set
high enough to guarantee that quadrature error would have
an insignificant effect on the final accuracy. The reference
solution was computed by putting tenth-order basis functions
on an ellipse divided into 160 patches. Although we did not
know the accuracy of the reference solution a priori, we have
deduced  from  the  convergence  behavior  of  the  comparison 
solutions  that  it  is  at  least  ten  digits.  A  plot  of  the  monostatic 
cross  section  versus  angle  for  the  reference  solution  is  given 
in  Fig.  6. 

Fig.  7  demonstrates  that  one  can  realize  exponential  conver¬ 
gence  in  the  cross  section  by  using  high-order  basis  functions 
with a fixed patching. In the high-accuracy regime, memory
usage  is  optimized  by  using  large  patches  and  high-order  basis 
functions.  In  the  low-accuracy  regime,  the  accuracy  is  not  that 
sensitive  to  the  discretization  for  a  given  density  of  unknowns. 
The  accuracy  at  which  the  various  curves  tend  to  bunch  up  is 
geometry dependent, but, as a general rule, can be expected
to  decrease  as  the  problem  size  increases. 

The  analog  to  Fig.  5  for  the  ellipse  is  Fig.  8. 

C. 300λ Bat

A  bat  is  composed  of  straight  faces  connected  smoothly  by 
circular arcs of radius R. There are two long edges of length L

Fig. 7. Semilog plot of maximum relative error versus density of unknowns
for TM scattering from a 20λ × 2λ ellipse. Points corresponding to different
basis function orders for a fixed patch size are connected by lines and labeled
by the number of patches.

Fig. 8. Log-log plot of maximum relative error versus total computation time
required to calculate the monostatic cross section of a 20λ × 2λ ellipse with
TM polarization. Points corresponding to different basis function orders and
a fixed patching are connected by lines, which are labeled by the number of
patches used.


Fig. 9. "Bat" geometry.

and six short edges, each of length L/3, at right angles to each
other.  The  surfaces  of  the  corresponding  3-D  bat  are  assumed 
to  be  perfect  conductors.  It  is  interesting  from  a  practical 
point  of  view  because  it  has  three  high  cross  section  specular 
reflection  regions  (one  of  which  is  the  2-D  analog  of  a  corner 
cube) and a low cross section everywhere else (see Fig. 9).

The results shown here are for R = 1λ, L = 300λ. Fig. 10
shows two computations of the monostatic cross section as a
function of incidence angle for Dirichlet boundary conditions
(i.e., TM polarization). One computation was performed using
low-order basis functions; the other used high-order basis functions.
Both calculations used an exact surface representation,
quadratures good to at least eight digits of accuracy, and
exactly 6000 unknowns to represent the sources. In the former
case, the surface was broken up into 6000 segments, each
about 0.2λ long, and the sources were represented by pulse basis
functions (i.e., one unknown per segment). This constitutes the
standard, low-order procedure (except for the exact surface
representation used on the circular arcs) for solving a 2-D
scattering problem with TM polarization. In the latter case, the
surface was divided into 1200 patches, each about 1λ long, and
basis functions up to fourth-order were employed to represent
the sources (i.e., five unknowns per segment).

The two plots are very similar over a good portion of the
angular range, particularly in regions of high cross section.
There are narrow peaks at 45° and 135°, as expected, and a
broader peak centered at 180°, resulting from the "corner
square" effect. Note that the oscillations evident in the cross
section are the result of interference, not due to any solution
error. However, in the angular ranges from 0° to 30° and 60°
to 120°, there are significant disagreements. The "spikes" in
the upper plot of Fig. 10 are suspicious looking. Which is right?
How can one be sure?

Having  high-order  methods  at  one’s  disposal  makes  it 
possible  to  answer  these  questions  with  the  kind  of  certainty 
that  is  impractical  to  attain  with  low-order  methods.  If  we 
keep  the  same  patching  of  the  bat,  but  allow  up  to  fifth-order 
basis  functions  instead,  the  number  of  unknowns  increases  to 
7200.  This  corresponds  to  a  44%  increase  in  the  amount  of 
memory  required  to  store  the  impedance  matrix  and  a  73% 
increase in the amount of CPU time required to LU decompose
the impedance matrix (which is the most time-consuming step
in the solution process). More importantly, allowing for one
higher  polynomial  order  to  represent  the  sources  improves  the 
accuracy  of  the  solution  significantly.  So  much  so  that  we  are 
justified  in  using  the  fifth-order  solution  as  a  reference  solution 
against  which  we  can  compare  the  lower-order  solutions  in 
order  to  estimate  their  accuracies.  To  compute  a  reference 
solution  of  comparable  accuracy  by  the  standard,  low-order 
technique  would  require  subdividing  the  6000  patches  many 
times  into  smaller  patches.  The  number  of  unknowns  would 
increase  significantly.  In  principle,  it  could  be  done,  but  since 
CPU  time  for  LU  decomposition  and  memory  for  impedance 
matrix  storage  scale  so  badly  with  number  of  unknowns, 
the  cost  would  be  so  exorbitant  as  to  make  the  procedure 
impractical. 
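The 44% and 73% figures quoted above follow from the usual dense-matrix scalings, storage proportional to $N^2$ and LU decomposition time proportional to $N^3$. A short check (the helper name is hypothetical):

```python
def relative_cost_increase(n_old, n_new):
    """Fractional increase in dense-matrix storage (~N^2) and LU time (~N^3)."""
    ratio = n_new / n_old
    return ratio ** 2 - 1.0, ratio ** 3 - 1.0

if __name__ == "__main__":
    memory, lu_time = relative_cost_increase(6000, 7200)
    print(f"memory: +{memory:.0%}, LU time: +{lu_time:.0%}")
```

The same function shows why subdividing patches is so costly for a low-order method: doubling the number of unknowns quadruples the storage and multiplies the LU time by eight.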

Fig. 11 shows plots of the differences between the fifth-order
reference solution and the two solutions plotted in
Fig. 10. It is evident that the fourth-order solution is the better
of the two. As expected, the error is least where the cross
section is highest. The estimated error of the fourth-order
solution is generally below $10^{-3}\lambda$; at a few angles it rises
to almost $10^{-2}\lambda$. If error bars were to be plotted on the high-order
data of Fig. 10, they would all be less than the thickness
of the plotted line. Fig. 11 also shows the estimated error
of the low-order solution to be generally higher. Whereas it
is probably acceptable over angular regions where the cross
section is high, in the low cross section region the error cannot
be considered acceptable, exceeding, as it does, 20 dB for
certain angles. Similar results obtain for TE polarization.


V.  Summary 

The  unfavorable  tradeoff  between  cost  and  problem  size  for 
method  of  moments  solutions  to  scattering  problems  is  well 
known and several so-called “fast” methods, such as the fast
multipole method [10], have been devised in recent years to
address it.

Fig. 11. RCS error with respect to reference solution computed
with fifth-order basis functions on 1λ patches (7200 unknowns). Upper curve:
zeroth-order calculation; lower curve: fourth-order calculation.

The  subject  of  this  paper  is  another  tradeoff  that,  while  no 
less  important,  is  apparently  much  less  widely  appreciated.  It 
is  the  tradeoff  between  cost  and  accuracy  for  a  fixed  problem 
size.  Improving  the  accuracy  of  a  computed  solution  requires 
refining the discretization, which in turn requires more memory
and more computation time. With low-order methods the
amount of additional computer memory and time required to
achieve  a  more  accurate  result  may  be  substantial.  High-order 
methods  are  designed  to  make  accuracy  improvements  much 
less  costly. 

The focus in this paper has been on using high-order basis
functions to compute cross sections in 2-D. High-order basis
functions are part of the triad of high-order methods that
make FastScat a high-order scattering code. The results show
that by using high-order methods it is possible to achieve
very accurate solutions to simple scattering problems on a
workstation in a reasonable amount of time. Furthermore, we
have demonstrated that the solution converges at a geometric
rate as a function of patch size for fixed basis function order
and exponentially as a function of basis function order for
fixed patch size. For high accuracies, the most computationally
efficient solutions, in terms of both memory and CPU time,
are produced by using high-order basis functions on large
patches.

High-order methods are important for doing large problems
as well. In fact, the adverse effects of a low-order discretization
are likely to manifest themselves even more prominently as
problems grow in size. The error caused by a low-order
discretization will be particularly noticeable on scatterers
whose cross section has a large dynamic range as a function
of angle. We devised a large 2-D scatterer called the bat
in order to demonstrate this effect. We observed that where
the cross section is high, solutions computed using low-order
and high-order basis functions were about the same, whereas
in the more interesting regions where the cross section is
low, the high-order solution is accurate while the low-order
solution has significant errors. Had we used a low-order
surface representation the result would likely have been worse
still. The bat also demonstrated the practical utility of high-order
methods for estimating the accuracy of a computed
solution.

We show how to solve time-harmonic scattering problems by means of a high-order
Nystrom discretization of the boundary integral equations of wave scattering
in 2D and 3D. The novel aspect of our new method is its use of local corrections to
the discretized kernel in the vicinity of the kernel singularity. Enhanced by local corrections,
the new algorithm has the simplicity and speed advantages of the traditional
Nystrom method, but also enjoys the advantages of high-order convergence for controlling
solution error. We explain the practical details of implementing a scattering
code based on a high-order Nystrom discretization and demonstrate by numerical
example that a scattering code based on this algorithm can achieve high-order convergence
to the correct answer. We also demonstrate its performance advantages over
a high-order Galerkin code. © 1998 Academic Press

Key  Words:  high-order  numerical  method;  Nystrom  method;  boundary  integral 
equation;  Nystrom  discretization;  local  corrections;  acoustic  scattering;  electromag¬ 
netic  scattering. 


I.  INTRODUCTION 

High-order  methods  are  numerical  methods  characterized  by  their  ability  to  obtain  extra 
digits  of  precision  with  comparatively  small  additional  effort.  Scattering  codes  that  employ 
high-order  methods  have  a  distinct  advantage  over  scattering  codes  that  use  low-order  meth¬ 
ods  when  it  comes  to  computing  results  accurately.  We  demonstrated  this  advantage  with 
a  Galerkin  method  of  moments  scattering  code  called  FastScat™  [1,2],  which  employs 

1 This research was supported by the Defense Advanced Research Projects Agency of the U.S. Department of
Defense under Contract MDA972-95-C-0021 and by the Hughes Electronics Corporation.

high-order methods in its geometry description, current basis functions, and quadratures. In
terms of memory efficiency, the advantage of using a high-order code such as FastScat was
clear. For a given number of unknowns, results obtained with FastScat were generally more
accurate than those obtainable by low-order codes, with the accuracy gap widening rapidly
as the number of unknowns applied to the problem was increased. In terms of CPU time efficiency,
however, the advantage of using a high-order code such as FastScat was not so clear.
The precomputation phase of the calculation often accounted for an undesirably large fraction
of the total solution time. Although we were able to significantly accelerate the part of the
precomputation phase devoted to computing near-interaction matrix elements by using high-order
regulated kernels [3], the overall matrix fill procedure was still considered too slow.

The  precomputation  phase  of  a  Galerkin  scattering  calculation  is  time  consuming  because 
it requires numerical evaluation of the convolution of the kernel with basis functions on every
pair of source and field patches. This amounts to $N^2$ numerical double integrations over
patches,  where  N  is  the  number  of  unknowns.  By  contrast,  when  a  point-based  (Nystrom) 
discretization  is  used,  the  impedance  matrix  fill  step  consists  of  nothing  more  than  a  kernel 
evaluation  to  fill  most  matrix  elements  and  O(N)  single  integrations  and  some  low-rank 
linear  algebra  to  fill  the  others  (specifically,  the  near  interactions).  As  a  result,  use  of  a 
point-based  discretization  dramatically  reduces  precomputation  time. 

Despite  its  simplicity  and  speed  advantages,  the  Nystrom  method  has  not  been  widely 
used  for  discretizing  the  integral  equations  that  arise  in  2D  and  3D  scattering  problems. 
In fact, we know of only a few reported instances, of which [4, 5] are examples. The
problem is that the conventional Nystrom method [6] is designed to handle regular kernels,
whereas the Helmholtz kernel for wave scattering is singular wherever the source point
coincides with the field point. The standard way [6] to try to overcome this problem is to
use so-called “singularity extraction,” which, in practice, removes the infinity in the kernel
but not the singularities in the kernel's derivatives. While singularity extraction avoids the
dilemma caused by numerical evaluation of the kernel at infinities, it does not generalize
easily to arbitrary surface patch geometries and it is a low-order method. In this paper, we
introduce “local corrections” as a means to overcome the problems associated with kernel
singularities.  This  enhanced  Nystrom  discretization  method  has  all  the  advantages  of  the 
standard  Nystrom  method  combined  with  the  high-order  convergence  capability  required 
to  achieve  error  control. 

This  paper  provides  a  detailed  explanation  for  using  the  Nystrom  method  to  solve  scat¬ 
tering  problems  in  the  2D  and  3D  scalar  cases  and  the  3D  vector  case  (by  which  we  mean 
electromagnetic  scattering  based  on  the  Maxwell  equations),  as  well  as  numerical  evidence, 
demonstrating  the  method’s  utility.  The  first  section  reviews  the  traditional  Nystrom  method 
for  discretizing  integral  equations  and  explains  how  it  can  be  adapted  to  handle  singular 
kernels  by  incorporating  local  corrections.  The  second  section  discusses  practical  aspects  of 
implementing  a  high-order  Nystrom  code,  such  as  appropriate  surface  models  and  meshes, 
choice  of  testing  functions  for  computing  local  corrections,  and  how'  to  compute  scattering 
results.  In  the  fourth  section,  we  show  numerical  results  for  some  2D  and  3D  canonical 
scatterers  to  demonstrate  that  our  implementation  of  the  Nystrom  method  achieves  high- 
order  convergence  to  the  correct  answer.  We  also  demonstrate  the  run-time  performance 
benefits  of  a  using  high-order  Nystrom  code,  compared  to  high-  and  low-order  Galerkin 
codes,  in  this  section.  Finally,  the  Appendix  describes  howr  the  local  correction  intesrals  for 
2D  scalar.  3D  scalar,  and  3D  electromagnetic  scattering  can  be  formulated  for  efficient  and 
accurate  numerical  evaluation. 
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II.  NYSTROM  METHOD 

A. Conventional Nystrom Method

The  conventional  Nystrom  method  is  a  simple  and  efficient  mechanism  for  discretization 
of  integral  equations  with  nonsingular  kernels.  Consider  the  integral  equation 

$$\phi(\mathbf{x}) = \int_S ds'\, G(\mathbf{x} - \mathbf{x}')\, \psi(\mathbf{x}') \tag{1}$$

and a quadrature rule for integrating a function $f(\mathbf{x})$ over the region $S$

$$\int_S ds\, f(\mathbf{x}) \approx \sum_{n=1}^{N} \omega_n f(\mathbf{x}_n). \tag{2}$$

Such a quadrature rule will be provided by Gauss-Legendre or Gauss-Jacobi rules on a
parameterization of $S$, so that the weights $\omega_n$ will be the products of the elementary weights
$w_n$ with the Jacobian of the parameterization:

$$\omega_n = \sqrt{g(\mathbf{u}_n)}\, w_n, \tag{3}$$

$$\mathbf{x}_n = \mathbf{x}(\mathbf{u}_n), \tag{4}$$

where $\mathbf{u}_n$ are the abscissae of the elementary rule, $\mathbf{x}(\mathbf{u})$ is the mapping function of the
surface $S$, and $g(\mathbf{u})$ is the determinant of the mapping metric. The extension to patched
parameterizations  is  straightforward. 
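
As a minimal sketch of Eqs. (3) and (4) for a 2D curve, the following Python fragment builds the Nystrom sample points and weights on a single patch. The parabola, the function names, and the 12-point rule are illustrative assumptions and do not come from the paper; for a curve, $\sqrt{g(u)}$ reduces to the speed $|d\mathbf{x}/du|$.

```python
import math
import numpy as np

def curve(u):
    """Assumed example mapping x(u): the parabola (u, u^2) for u in [0, 1]."""
    return np.stack([u, u ** 2], axis=-1)

def sqrt_g(u):
    """Jacobian of the mapping; for this curve |dx/du| = sqrt(1 + 4u^2)."""
    return np.sqrt(1.0 + 4.0 * u ** 2)

def nystrom_rule(order, a=0.0, b=1.0):
    """Return sample points x_n and weights omega_n = sqrt(g(u_n)) * w_n."""
    t, w = np.polynomial.legendre.leggauss(order)   # elementary rule on [-1, 1]
    u = 0.5 * (b - a) * t + 0.5 * (b + a)           # abscissae mapped to [a, b]
    w = 0.5 * (b - a) * w                           # elementary weights on [a, b]
    return curve(u), sqrt_g(u) * w

points, omega = nystrom_rule(12)
arc_length = omega.sum()                            # integrates f(x) = 1 over the curve
exact_length = math.sqrt(5.0) / 2.0 + math.asinh(2.0) / 4.0
print(arc_length, exact_length)
```

Summing the weights integrates $f = 1$ over the curve, so the result should agree with the closed-form arc length of the parabola to within the accuracy of the rule.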

The Nystrom discretization of a function on $S$ is simply the tabulation of the function at
the quadrature points $\mathbf{x}_n$:

$$\psi_n = \psi(\mathbf{x}_n). \tag{5}$$

To discretize integral Eq. (1), we simply form a matrix from the kernel:

$$\phi_m = \sum_{n=1}^{N} \omega_n\, G(\mathbf{x}_m - \mathbf{x}_n)\, \psi_n. \tag{6}$$

This discretization has an error of the same order as the underlying quadrature rule [7].
In other words, if the surface $S$ is smooth, $\psi$ and $G(\mathbf{x} - \mathbf{x}')$ are regular functions, and if
a high-order quadrature rule is used, then the solution to Eq. (6) represents a high-order
approximation  to  the  exact  solution. 
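
The behavior described above can be illustrated with a short Python sketch that applies Eq. (6) for a regular kernel and compares the result with a closed-form integral. The kernel $\cos(x - x')$, the source $e^{x'}$, and the interval $[-1, 1]$ are assumptions chosen only because the exact answer is known; they are not examples from the paper.

```python
import numpy as np

def kernel(d):
    """Assumed regular kernel G(x - x') = cos(x - x')."""
    return np.cos(d)

def apply_operator(order):
    """Apply Eq. (6) on S = [-1, 1], where omega_n equals the Gauss-Legendre weight."""
    x, w = np.polynomial.legendre.leggauss(order)
    psi = np.exp(x)                              # Eq. (5): tabulate psi at x_n
    G = kernel(x[:, None] - x[None, :])          # matrix G(x_m - x_n)
    return x, G @ (w * psi)                      # phi_m

def exact_phi(x):
    """Closed form of the integral of cos(x - t) * exp(t) over t in [-1, 1]."""
    F = lambda t: 0.5 * np.exp(t) * (np.cos(x - t) - np.sin(x - t))
    return F(1.0) - F(-1.0)

errors = {}
for order in (2, 4, 8):
    x, phi = apply_operator(order)
    errors[order] = np.max(np.abs(phi - exact_phi(x)))
    print(order, errors[order])
```

Because both the kernel and the source are smooth here, the error falls rapidly as the order of the rule increases, which is the high-order behavior that a singular kernel would destroy.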

Unfortunately,  the  kernels  G(x  —  x')  for  wave  scattering  are  not  regular.  Instead,  they 
have  singularities  (or  even  hypersingularities)  at  short  distances.  With  such  kernels  it  is 
often  not  even  possible  to  make  a  matrix  out  of  the  kernel  because  its  value  is  undefined 
when  x  =  x'.  Even  if  the  kernel  were  finite  at  vanishing  separation,  a  kernel  singular  in  its 
higher  derivatives  would  spoil  the  high-order  properties  of  the  above  prescription. 

B.  High-Order  Nystrom  Method  for  Singular  Kernels 

We  have  adapted  the  Nystrom  method  to  handle  singular  kernels,  without  sacrificing  high- 
order  convergence,  by  incorporating  Strain’s  method  [8]  for  obtaining  high-order  quadrature 
rules for singular functions. The essence of the method is that by computing convolutions
of the kernel with a suitable set of testing functions, it is possible to determine how to adjust
the quadrature rule so that it is just as accurate near the singularity as far from it. The beauty
of the method is that these quadrature rule modifications are required only in the vicinity
of  the  singularity,  hence  the  name  local  corrections. 

Conceptually,  local  corrections  may  be  viewed  as  adjustments  to  the  quadrature  weights 
(at  the  original  set  of  sample  points)  that  are  required  to  make  the  quadrature  rule  high-order 
accurate when the (singular) function $G(\mathbf{x} - \mathbf{x}')$ is included in the integrand. In practice,
since quadrature weights and discretized kernel terms always enter into the quadrature rule
as product pairs, one can equally well "locally correct" the discretized representation of
the kernel and keep the original quadrature weights. This is the preferred approach because the
modified representation of the kernel has no infinities. We can write the "corrected" matrix
representation  of  the  kernel  as 

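$$\tilde{G}_{mn} = \begin{cases} L_{mn}, & \text{when } \mathbf{x}_n \in D_m, \\ G(\mathbf{x}_m - \mathbf{x}_n), & \text{otherwise,} \end{cases} \tag{7}$$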

where $L_{mn}$ is a (sparse) matrix of local corrections whose entries are nonzero only for source
points $\mathbf{x}_n$ within a small domain $D_m$ centered on the field point $\mathbf{x}_m$. For $|\mathbf{x}_m - \mathbf{x}'|$ sufficiently
large (i.e., outside the local correction domain $D_m$), $G(\mathbf{x}_m - \mathbf{x}')$ is a smoothly varying
function of position and the underlying quadrature rule provides a high-order approximation
to the desired integral. Close to the singularity, on the other hand, the singular nature of
the kernel spoils the high-order behavior of the underlying quadrature rule, and it becomes
necessary to use locally corrected values for the kernel instead of $G(\mathbf{x}_m - \mathbf{x}_n)$ in order to
achieve  high-order  convergence.  The  mechanism  for  computing  the  local  corrections  for 
a  given  set  of  source  points  is  explained  below.  The  size  of  the  local  correction  domain  is 
discussed  in  Section  III.D. 

The  underlying  quadrature  rule  is  exact  for  integration  of  a  certain  class  of  functions 
(typically  polynomials).  We  choose  the  local  corrections  to  make  convolution  of  the  singular 
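kernel with the same class of functions exact. They are obtained by solving the linear system

$$\sum_n \omega_n L_{mn}\, f^{(k)}(\mathbf{x}_m - \mathbf{x}_n) = \int_{D_m} ds'\, G(\mathbf{x}_m - \mathbf{x}')\, f^{(k)}(\mathbf{x}_m - \mathbf{x}'), \tag{8}$$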

which represents $K$ constraints (one for each testing function $f^{(k)}$) on $J$ local correction
coefficients (one for each of $J$ source points in the vicinity of the $m$th field point). The
integral over $D_m$ can be obtained by oversampling the region of integration until the result
has converged to the desired accuracy. The nonzero components of the $m$th row of the local
correction matrix are obtained by inverting the (small) system of equations above, either by
factorization (via LU decomposition) if $J = K$ or by singular value decomposition (SVD) if
$J \neq K$. Computing local corrections is the most time consuming step of the precomputation
phase.  Fortunately,  it  needs  to  be  done  only  once  at  every  sample  point. 
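
The construction in Eq. (8) can be sketched in one dimension as follows. This is an illustration under stated assumptions and not the authors' implementation: the patch is the interval $[-1, 1]$, the kernel is $\log|x - x'|$, the testing functions are monomials in $x_m - x'$, and the right-hand side moments are evaluated in closed form instead of by oversampling. The helper name `log_moment` is hypothetical.

```python
import math
import numpy as np

def log_moment(k, a, b):
    """Exact integral of t**k * log|t| over [a, b]; a < 0 < b is allowed."""
    def F(t):
        return t ** (k + 1) * (math.log(abs(t)) / (k + 1) - 1.0 / (k + 1) ** 2)
    return F(b) - F(a)

J = 8                                            # source points on the patch
x, w = np.polynomial.legendre.leggauss(J)        # patch [-1, 1], omega_n = w_n
m = J // 2                                       # field point is one of the samples
xm = float(x[m])
t = xm - x                                       # x_m - x_n
a, b = xm - 1.0, xm + 1.0                        # range of x_m - x' over the patch

K = J                                            # one testing function per source point
A = np.array([w * t ** k for k in range(K)])     # left side of Eq. (8), one row per k
rhs = np.array([log_moment(k, a, b) for k in range(K)])
L_row = np.linalg.solve(A, rhs)                  # square system, solved by LU
# For J != K, np.linalg.lstsq(A, rhs, rcond=None)[0] gives an SVD-based solution.

psi = np.exp(x)                                  # smooth test source
corrected = np.sum(w * L_row * psi)

G_naive = np.zeros(J)                            # uncorrected kernel, singular term dropped
others = np.arange(J) != m
G_naive[others] = np.log(np.abs(t[others]))
naive = np.sum(w * G_naive * psi)

# Reference value from the Taylor series of exp(x') about x_m.
reference = math.exp(xm) * sum(
    (-1) ** j * log_moment(j, a, b) / math.factorial(j) for j in range(60)
)
err_corrected = abs(corrected - reference)
err_naive = abs(naive - reference)
print(err_corrected, err_naive)
```

In this sketch the corrected row reproduces the singular integral far more accurately than the uncorrected rule, even though both use the same eight sample points.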

C.  High-Order  Nystrom  Method  Advantages 

There are several reasons for using the Nystrom method to achieve a high-order
discretization:

• Faster precomputation. Unlike the Galerkin method, which requires $N^2$ numerical
double integrations to fill the impedance matrix, the Nystrom method requires fewer than $N^2$
kernel evaluations and $O(N)$ calculations of local correction coefficients (each of which
involves a small number of adaptive integrations and a low-rank matrix inversion). An additional
acceleration is possible when multiple solutions are desired at different frequencies.
This comes about because a frequency-dependent Helmholtz kernel can be written as the
product of a smoothly varying, frequency-dependent function and a frequency-independent
Laplace kernel. Once the local corrections for the Laplace kernel have been computed, they
can be used with minor modification at any frequency.

• Elimination of multipatch, parametric basis functions. Conventional method of moments
scattering codes require basis functions with a certain level of continuity (in the
surface parameterization) across patch boundaries to facilitate differentiation. For example,
an important property of the popular RWG [9] basis functions for electromagnetic scattering
is that their normal components are continuous across patch boundaries. One can
also use high-order extensions to the RWG basis functions [10], although we have found
that implementing these basis functions in a scattering code can be both complicated and
inconvenient, especially for arbitrary, curved surfaces. Fortunately, for high-order codes
the requirement to use elemental sources with guaranteed continuity between patches disappears
because continuity of the source distribution is achieved as a natural consequence
of accurately solving the integral equation. (This is so because the error caused by not
enforcing continuity of the elemental sources is comparable
to the error of the underlying discretization. With a low-order discretization (e.g., RWG
basis functions on flat patches), continuity enforcement has a significant payoff because
the error in the underlying discretization is also significant. With a high-order discretization,
where the error due to the underlying discretization can more easily be made insignificant,
the situation is reversed. Thus, for high-order codes, whether Galerkin or
Nystrom, the benefits of enforcing source continuity between patches do not outweigh the
inconveniences.)

• More amenable to fast solution algorithms. Implementation of a fast method that
requires segregation of the discretized scatterer into groups (such as the fast multipole
method (FMM) [11] or adaptive integral method (AIM) [12]) is simpler and more natural
with a point-based discretization. When a Galerkin implementation with overlapping basis
function domains is employed, the fast algorithm is either more complicated (because multipatch
basis functions must be split apart) or less efficient (because the groups are larger).
A Galerkin implementation that uses high-order basis functions (even those confined to
single patches) cannot achieve optimum efficiency from the FMM because high-order basis
functions are used to their greatest advantage on patches larger than a wavelength, whereas
optimum use of the FMM favors groups smaller than a wavelength. In a Nystrom discretization,
the groups consist of individual sample points on the surface, so no such grouping
restrictions apply.

• Iterative solver memory reduction. With the Nystrom method, the memory requirement
for an iterative solver using the full impedance matrix can be reduced from $O(N^2)$ (storing
the full impedance matrix) to $O(N)$ (storing only the sparse local correction matrix). This
is practical because reconstruction of the unsaved portions of the impedance matrix only
requires evaluations of the kernel, which are fast. If the FMM is used to represent the far
interactions, the storage requirement goes from $O(N^{5/4})$ in the single-stage case [13] to
$O(N \log N)$ in the multilevel case [14].

• Symmetry exploitation. When basis functions are used, it is more complicated to reflect
geometrical symmetries in the matrix representation. It may be necessary to explicitly
consider basis function transformation properties and to provide special treatment for some
variables (e.g., the coefficients of basis functions whose domains intersect reflection planes).
In  the  Nystrom  case,  the  representation  of  symmetries  is  much  simpler. 

III.  PRACTICAL  CONSIDERATIONS 


A.  Surface  Description 

Without  a  high-order  surface  description,  a  high-order  Nystrom  discretization  is  of  little 
benefit.  For  example,  representing  a  curved  surface  by  means  of  flat  facets  limits  the  rate  of 
solution  convergence  to  low  order  whether  or  not  the  rest  of  the  discretization  method  is  high 
order.  Ideally,  the  internal  representation  of  the  surface  exactly  matches  the  physical  surface. 
Such  a  representation  is  possible  for  idealized  curved  shapes  such  as  circles,  ellipses,  ogives, 
etc. in 2D, and spheres, ellipsoids, etc. in 3D. For curved objects of more practical interest,
a  high-order  description  of  the  physical  surface  may  be  given  by  high-order  parametric 
representations  such  as  bicubic  splines  or  NURBS  (nonuniform  rational  B-splines).  As 
these  are  often  the  representations  used  by  a  CAD  program  to  describe  the  object  as  it  is 
being  designed  and  built,  it  is  appropriate  that  we  should  also  use  them  for  electromagnetic 
or  acoustic  modelling  purposes. 

Use  of  a  high-order  surface  description  is  distinguished  from  that  of  a  faceted  description 
in  that  the  subdivision  of  the  surface  into  patches  is  typically  done  once  and  refining  the 
discretization  to  improve  accuracy  is  accomplished  by  increasing  the  order  of  the  quadrature 
rule (which increases the number of sample points per patch).

B.  Meshing 

The  essence  of  a  point-based  discretization  is  the  tabulation  of  functions  at  a  set  of  points 
lying  on  the  surface.  This  need  not  have  anything  to  do  with  subdividing  a  surface  into 
patches.  Indeed,  in  the  2D  case,  patches  can  be  done  away  with  entirely  on  closed  surfaces 
(i.e., closed curves) parameterized by arc length, because the trapezoidal rule is a high-order
quadrature rule for periodic functions. In 3D, however, global parameterizations with
natural, high-order quadrature rules are much harder to come by, so subdivision of a surface
into  patches,  each  of  which  comes  with  its  own  high-order  quadrature  rule,  becomes  a 
practical  necessity. 
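
The claim that the trapezoidal rule is high order for periodic functions can be checked with a brief Python sketch. The unit circle and the integrand $e^{x}$ are assumptions picked because the exact integral, $2\pi I_0(1)$, is known; the function name is hypothetical.

```python
import math
import numpy as np

def trapezoid_on_circle(f, n_points, radius=1.0):
    """Equal-weight (trapezoidal) rule on a circle parameterized by arc length."""
    theta = 2.0 * math.pi * np.arange(n_points) / n_points
    omega = 2.0 * math.pi * radius / n_points        # same weight at every point
    x, y = radius * np.cos(theta), radius * np.sin(theta)
    return omega * np.sum(f(x, y))

f = lambda x, y: np.exp(x)                           # smooth test integrand (assumed)
exact = 2.0 * math.pi * 1.2660658777520082           # 2*pi*I_0(1)
trap_errors = {n: abs(trapezoid_on_circle(f, n) - exact) for n in (4, 8, 16)}
print(trap_errors)
```

Doubling the number of points from 8 to 16 takes the error from roughly the $10^{-6}$ level down to rounding error, so no patches or special rules are needed on a smooth closed curve.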

Since  patches  are  introduced  solely  for  the  purpose  of  providing  ready-made,  high-order 
quadrature  rules  on  the  surface,  the  job  of  meshing  a  surface  is  simpler  and  less  restrictive. 
Specifically,  whereas  a  mesh  designed  for  use  with  RWG-type  basis  functions  is  not  allowed 
to  have  a  vertex  in  the  middle  of  an  edge,  there  is  no  such  restriction  on  a  mesh  designed 
for  a  point-based  discretization.  The  only  practical  restrictions  are  that  the  mesh  cover  the 
surface  and  that  the  patches  not  be  so  distorted  or  curved  that  the  supposedly  high-order 
quadrature  rules  are  not  actually  high  order. 

C.  Testing  Functions 

The  choice  of  testing  functions  goes  together  with  the  choice  of  quadrature  rule.  If  the 
quadrature  rule  is  designed  to  efficiently  integrate  regular  functions,  the  testing  functions 
should be regular functions of increasing order. In locations where singular behavior of the
source function is expected, such as near geometric singularities (e.g., edges and corners),
it  may  be  desirable  to  apply  a  different  quadrature  rule  and  use  appropriately  singular 
testing  functions  [15].  For  purposes  of  this  discussion,  we  will  assume  the  scattering  surface 
and  the  sources  are  smooth  functions  of  position.  Any  departures  from  regularity  can  be 
accommodated  reasonably  efficiently  by  tapering  the  size  of  the  patches  in  the  direction  of 
the  singularity. 

Testing functions may be global or local. Examples of global testing functions are monomials
in the surface parameter $u$ in the 2D case, and powers of $x$, $y$, and $z$ in the 3D case. The
advantage of using global testing functions to compute local corrections on smooth surfaces
is that such testing functions are manifestly continuous across patch boundaries, just like
the sources. Sometimes enforcing continuity is a mistake, however, such as when the field
point and source patch are near each other but on separate, unconnected surfaces. Global
testing functions can also perform badly near geometric singularities such as a right-angle
bend. Local testing functions (i.e., testing functions confined to individual patches) do not
take  full  advantage  of  the  guaranteed  continuity  of  the  sources  on  touching  patches  but  are 
the  preferred  choice  because  they  are  simpler  to  implement  and  more  robust. 

With  local  testing  functions,  the  local  corrections  for  a  given  field  point  can  be  computed 
on  a  patch  by  patch  basis.  Thus,  the  number  of  points  whose  quadrature  weights  are  being 
corrected  always  equals  the  number  of  sample  points  on  the  patch.  Doing  this  has  the  side 
benefit  of  keeping  down  the  size  of  the  local  correction  linear  systems  that  must  be  solved 
when  it  becomes  necessary  to  compute  local  corrections  for  points  on  several  patches. 

The number of local testing functions to use is still a free parameter. In 2D, where use of
a Gauss-Legendre rule of order $M$ allows exact integration of polynomials up to order $2M$
(i.e., degree $2M - 1$), it makes sense to use as many testing functions as there are points to
locally correct. In effect, the singular kernel and the unknown source function are both being
approximated to order $M$, which means the order of approximation for the product is $2M$.
This  results  in  an  exactly  determined  system  of  equations  for  computing  local  corrections. 
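
The exactness property quoted above is easy to confirm numerically. The following Python sketch, with $M = 5$ chosen arbitrarily, checks that an $M$-point Gauss-Legendre rule integrates monomials up to degree $2M - 1$ exactly and first fails at degree $2M$.

```python
import numpy as np

M = 5
x, w = np.polynomial.legendre.leggauss(M)

def quadrature_error(degree):
    """Error of the M-point rule for the monomial x**degree on [-1, 1]."""
    exact = 0.0 if degree % 2 else 2.0 / (degree + 1)
    return abs(np.sum(w * x ** degree) - exact)

err_within = max(quadrature_error(d) for d in range(2 * M))   # degrees 0 .. 2M-1
err_beyond = quadrature_error(2 * M)                          # first degree that fails
print(err_within, err_beyond)
```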

In 3D, if a Gauss-Legendre product rule of order $M_x M_y$ is used on quadrilateral patches,
the natural number of local testing functions to use is $4 M_x M_y$. This leads to an exactly
determined  system.  If  the  patches  are  triangles,  one  can  use  the  quadrature  rules  of  Lyness 
and  Jespersen  [16]  and  their  higher-order  extensions.  For  these  triangle  rules,  a  natural 
correspondence  between  the  number  of  sample  points  and  the  maximum  testing  function 
degree  is  less  obvious.  When  the  number  of  sample  points  and  the  number  of  testing 
functions  are  not  the  same,  they  can  at  least  be  made  close,  in  which  case  the  nonsquare  linear 
system  of  equations  for  the  local  corrections  can  be  solved  by  computing  a  pseudoinverse 
using  SVD.  In  our  experience,  local  correction  systems  that  are  square  or  nearly  square 
perform  best. 

C.1. Two-dimensional scalar testing functions. Monomials of increasing degree in the
parameterization, i.e., $f^{(k)}(u) = u^k$, are the simplest testing functions, but they can also be
troublesome when using high-order rules because they produce linear systems for computing
local corrections whose condition number grows exponentially with degree. The alternative
we favor is orthogonal polynomials such as Legendre or Lagrange polynomials. With either
of these polynomials as testing functions, it takes a little longer to compute the integral on
the right-hand side of Eq. (8), but the linear system is well conditioned for all polynomial
degrees. In addition, if the number of testing functions $K$ equals the number of source
points whose quadrature weights are being corrected $J$, then the system is orthogonal and
the matrix consisting of the $K$ testing functions evaluated at the $J$ different source points
can be inverted simply by transposition.
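
The conditioning argument can be made concrete with a small Python comparison. The sketch assumes a 16-point Gauss-Legendre rule on $[-1, 1]$ and scales the orthonormal Legendre functions by $\sqrt{w_n}$ so that the resulting square matrix is orthogonal; this scaling is one way to realize the "inversion by transposition" property and is an assumption of the example, not a statement of how the authors' code does it.

```python
import numpy as np
from numpy.polynomial import legendre

J = 16
x, w = legendre.leggauss(J)

# Monomial testing functions: V_mono[k, n] = x_n**k
V_mono = np.vander(x, J, increasing=True).T

# Orthonormal Legendre testing functions, scaled by sqrt(w_n):
# Q[k, n] = sqrt((2k + 1) / 2) * P_k(x_n) * sqrt(w_n)
norms = np.sqrt((2.0 * np.arange(J) + 1.0) / 2.0)
Q = norms[:, None] * legendre.legvander(x, J - 1).T * np.sqrt(w)[None, :]

cond_mono = np.linalg.cond(V_mono)
cond_legendre = np.linalg.cond(Q)
orthogonality_defect = np.max(np.abs(Q.T @ Q - np.eye(J)))
print(cond_mono, cond_legendre, orthogonality_defect)
```

The monomial matrix has a large condition number that grows with $J$, whereas the scaled Legendre matrix has condition number one and its inverse is simply its transpose.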

C.2. Three-dimensional scalar testing functions. The trade-off between the simplicity
of monomials and the better conditioning behavior associated with orthogonal polynomials
exists also in the 3D cases. In 3D, however, our experience has been confined to testing
functions of a low enough degree that use of monomial functions generally does not pose
any serious trouble. On triangular patches, we use testing functions of the form

$$f^{(k)}(\mathbf{u}) = (u^1)^m (u^2)^n, \tag{9}$$

where $u^1$ and $u^2$ are the parameters of the surface description and the exponents obey
$0 \le m, n \le M$ and $0 \le m + n \le M$ for some maximum testing function degree $M$.

C.3. Three-dimensional vector testing functions. In this case, vector testing functions
locally tangent to the surface are required; continuity of the testing functions between
adjacent patches is not. A natural set of basis vectors is given by the derivatives of the
surface with respect to the two surface parameters $u^1$ and $u^2$. We use testing functions of
the form

$$\mathbf{t}_\nu^{(k)}(\mathbf{u}) = \frac{\partial_\nu \mathbf{x}(\mathbf{u})}{\sqrt{g(\mathbf{u})}}\, f^{(k)}(\mathbf{u}), \tag{10}$$

where $\nu = 1, 2$ and the scalar functions $f^{(k)}(\mathbf{u})$ are the same as those used in the 3D scalar
case. This form for the testing functions has the property that the surface divergence of
$\mathbf{t}_\nu^{(k)}(\mathbf{u})$ is

$$\nabla \cdot \mathbf{t}_\nu^{(k)}(\mathbf{u}) = \frac{\partial_\nu f^{(k)}(\mathbf{u})}{\sqrt{g(\mathbf{u})}}, \tag{11}$$

since $\partial_\nu \mathbf{x}(\mathbf{u})/\sqrt{g(\mathbf{u})}$ is divergenceless (see Appendix C). This form for the divergence of
$\mathbf{t}_\nu^{(k)}(\mathbf{u})$ (which enters into the computation of local corrections for the hypersingular kernel)
has the especially desirable property that it avoids the need to compute second or higher
order derivatives of the surface.


D.  Extent  of  Local  Correction  Domain 

When  local  testing  functions  are  used,  the  region  over  which  local  corrections  should  be 
computed  always  includes  the  patch  containing  the  field  point,  and  it  extends  out  to  include 
other  patches  until  the  underlying  quadrature  rule  is  accurate  enough  to  replicate  the  exact 
answer  to  within  a  desired  tolerance.  Since  the  testing  functions  have  local  support,  the 
problem  of  computing  local  corrections  for  a  region  containing  several  patches  decouples 
naturally  into  several  smaller  local  correction  problems,  one  for  each  patch.  The  tolerance 
should  be  based  on  an  estimate  of  the  optimum  accuracy  that  the  particular  discretization 
could achieve; there is, after all, little to be gained by trying to evaluate the impedance
matrix  more  accurately  than  what  is  warranted  by  the  discretization.  The  integrals  on  the 
right-hand  side  of  Eq.  (8)  can  be  computed  by  adaptive  integration  to  comparable  accuracy. 

E. Local Corrections for “Regular” Parts of the Kernel

In  principle,  it  is  unnecessary  to  compute  local  corrections  for  regular  components  of 
the  kernel  because  they  will  be  efficiently  integrated  by  a  quadrature  rule  of  sufficiently 
high order. If such components are strongly peaked, however, the required order may be
so high that it is computationally more efficient to treat them as if they were singular
and compute local corrections for them. For example, the scalar kernel $\mathbf{n} \cdot \nabla' G(\mathbf{x}, \mathbf{x}')$ in
2D or 3D is a strongly peaked function of $\mathbf{x}'$ when the field point $\mathbf{x}$ is close to, but not
on, the source patch. This situation arises in the analysis of scattering from thin layers, for
example. One way to handle this problem is to put a fine discretization on each layer, in effect
subdividing the strongly peaked kernel function into small parts, each of which is relatively
smooth.  This  procedure  is  inefficient,  however,  because  it  uses  many  more  sample  points 
than  are  warranted  by  the  expected  spatial  structure  of  the  source.  A  better  approach  would 
be  to  discretize  each  layer  densely  enough  to  adequately  represent  the  sources  and  compute 
local  corrections  for  the  strongly  peaked  kernel.  Computing  such  local  corrections  can  be 
a  nontrivial  task  by  itself,  but  one  might  expect  that  the  extra  time  spent  in  precomputation 
would  be  compensated  by  a  less  time-consuming  solution  phase. 

F.  Using  the  Results 

F.1. Computing scattered fields. The amplitude of a scattered wave can be computed
by convolving the scattered wave with the source distribution. Even though a Nystrom
discretization specifies the source only at a finite set of points, these points are ideally
suited for evaluating integrals in a high-order fashion by virtue of Eq. (2). For example, the
amplitude $F(\mathbf{k})$ for 3D scalar scattering of a source distribution $\psi(\mathbf{x})$ on a surface $S$ with
Neumann boundary conditions (i.e., $\mathbf{n} \cdot \nabla \psi(\mathbf{x}) = 0$ for $\mathbf{x}$ on $S$) into the plane wave given
by $\phi(\mathbf{x}) = e^{i \mathbf{k} \cdot \mathbf{x}}$ is


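$$F(\mathbf{k}) = \frac{1}{4\pi} \oint_S ds\, (\mathbf{n} \cdot \nabla \phi^*(\mathbf{x}))\, \psi(\mathbf{x}) \tag{12}$$

$$\approx \frac{1}{4\pi} \sum_i \omega_i\, (\mathbf{n}(\mathbf{x}_i) \cdot \nabla \phi^*(\mathbf{x}_i))\, \psi(\mathbf{x}_i), \tag{13}$$

where the sum is over all quadrature points and * indicates complex conjugation. The
extensions to other forms of scattering, whether near- or far-field, are straightforward.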

F.2. Source interpolation. When a scattering problem is solved using a Galerkin scattering
code, it is obvious how to compute the value of the source distribution at any point
on the surface because the solved-for coefficients multiply basis functions that are uniquely
defined at every point on the surface. The Nystrom discretization, on the other hand, returns
values of the sources only at a finite set of discrete sample points, so that determining the
value of the source distribution at a point that is not part of this set requires interpolation.

When  the  scattering  computation  is  performed  using  a  second  kind  integral  formulation, 
one  can  use  the  original  Nystrom  interpolation  formula,  augmented  by  local  corrections,  to 
interpolate  the  source  distribution.  As  an  example,  if  the  magnetic  field  integral  equation 
(MFIE)  is  used  to  solve  for  the  electric  current  distribution  J(x)  induced  on  a  perfectly 
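electrically conducting (PEC) scatterer by an incident magnetic field $\mathbf{H}^{\mathrm{inc}}(\mathbf{x})$, one can write
the current at any point $\mathbf{x}$ on the surface $S$ as [17]

$$\mathbf{J}(\mathbf{x}) = 2\mathbf{n}(\mathbf{x}) \times \left[ \mathbf{H}^{\mathrm{inc}}(\mathbf{x}) - \oint_S ds'\, \nabla' G(\mathbf{x}, \mathbf{x}') \times \mathbf{J}(\mathbf{x}') \right]. \tag{14}$$

We obtain an interpolation formula from this continuous equation by using Eq. (2) to
approximate the integral, i.e.,

$$\mathbf{J}(\mathbf{x}) \approx 2\mathbf{n}(\mathbf{x}) \times \left[ \mathbf{H}^{\mathrm{inc}}(\mathbf{x}) - \sum_i \omega_i\, \nabla' G(\mathbf{x}, \mathbf{x}_i) \times \mathbf{J}(\mathbf{x}_i) \right], \tag{15}$$

where the sum over $i$ extends over all sample points on $S$. Of course, to make this a high-order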
interpolation  formula,  it  may  be  necessary  to  compute  local  corrections  to  the  quadrature 
rule  at  source  points  in  the  vicinity  of  the  field  point  x. 

Another  interpolating  function,  which  does  not  require  computing  new  local  corrections 
and  is  usable  with  first  or  second  kind  integral  formulations,  takes  the  form  of  a  linear 
combination  of  the  functions  that  are  integrated  exactly  by  the  underlying  quadrature  rule. 
The  coefficients  may  be  determined  by  convolving  the  source  with  the  projection  operator 


$$I(\mathbf{x}, \mathbf{x}') = \sum_{m,n} f_m(\mathbf{x})\, (N^{-1})_{mn}\, f_n(\mathbf{x}'), \tag{16}$$

where the summation extends over all functions $f_i(\mathbf{x})$ for which the quadrature rule is exact,
and $N$ is a normalization matrix whose components are given by

$$N_{mn} = \int_S ds\, f_m(\mathbf{x})\, f_n(\mathbf{x}). \tag{17}$$

If the $f_i(\mathbf{x})$'s are orthonormal over $S$, then $N$ is simply the identity matrix. Convolution
with $I(\mathbf{x}, \mathbf{x}')$ eliminates the part of a function that is orthogonal to all the $f_i(\mathbf{x})$'s. If we
evaluate the convolution of $I(\mathbf{x}, \mathbf{x}')$ with the source function by means of the underlying
quadrature rule, we arrive at the following source interpolation function $S(\mathbf{x})$, which only
requires knowledge of the source at the discrete set of sample points, $s(\mathbf{x}_i)$:

$$S(\mathbf{x}) = \sum_{m,n,i} f_m(\mathbf{x})\, (N^{-1})_{mn}\, \omega_i\, f_n(\mathbf{x}_i)\, s(\mathbf{x}_i). \tag{18}$$


The  summation  over  i  in  the  above  equation  extends  over  all  sample  points. 
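
A one-dimensional Python sketch of the interpolation formula in Eq. (18) follows. It assumes a single patch $[-1, 1]$ with a 10-point Gauss-Legendre rule and uses the first ten orthonormal Legendre functions as the $f_m$, so that the normalization matrix is the identity; the source $\sin(2x)$ and the function names are illustrative only.

```python
import numpy as np
from numpy.polynomial import legendre

J = 10
x, w = legendre.leggauss(J)                  # sample points and weights on [-1, 1]

def f(points):
    """Orthonormal Legendre functions f_m, m = 0..J-1, as an array of shape (J, P)."""
    pts = np.atleast_1d(np.asarray(points, dtype=float))
    norms = np.sqrt((2.0 * np.arange(J) + 1.0) / 2.0)
    return norms[:, None] * legendre.legvander(pts, J - 1).T

def interpolate(points, samples):
    """Eq. (18) with the normalization matrix equal to the identity."""
    coeffs = f(x) @ (w * samples)            # sum_i omega_i f_m(x_i) s(x_i)
    return coeffs @ f(points)                # sum_m f_m(x) * coeffs_m

source = lambda p: np.sin(2.0 * p)           # assumed smooth source
samples = source(x)
off_grid = np.array([-0.9, -0.3, 0.1, 0.65])
node_error = np.max(np.abs(interpolate(x, samples) - samples))
off_grid_error = np.max(np.abs(interpolate(off_grid, samples) - source(off_grid)))
print(node_error, off_grid_error)
```

With as many functions as sample points, the formula reproduces the tabulated values at the nodes and gives a high-order approximation between them, using only the stored samples and the existing quadrature weights.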


IV.  RESULTS 

This  section  is  composed  of  two  parts.  The  objective  of  the  first  part  is  to  show  that  our 
most recent version of FastScat, which uses a Nystrom discretization, achieves high-order
convergence  to  the  correct  answer  for  a  few  small,  benchmark  problems  from  2D  scalar  and 
3D  vector  scattering.  In  the  second  part,  we  benchmark  the  performance  of  this  code  against 
two  Galerkin  codes,  comparing  them  on  the  basis  of  CPU  time  and  solution  accuracy. 


A.  Validation 

The most common practice seen in the literature for demonstrating the validity of a scattering
code is to show that the results obtained from the code with a particular discretization
compare favorably to a reference solution obtained from a series solution, another scattering
code, or measurements. Individual results such as this, while useful and necessary, say nothing
about the convergence properties of the algorithm on which the code is based. To show
how an algorithm converges, one must compute results with a sequence of increasingly fine
discretizations and observe whether and how the results converge to the correct answer.

This  is  especially  important  when  validating  a  (purportedly)  high-order  code.  One  cannot 
expect  to  enjoy  the  benefits  of  a  high-order  code  (more  accurate  solutions,  solution  error 
control, etc.) on large scattering problems without first verifying that the code achieves high-order
convergence on small scattering problems (where it is easier to generate solutions with
very small errors). The order of convergence of a numerical method relates to the rate at
which the error in the computed solution decreases as the discretization scale decreases.
For small enough discretization scales $h$, the error in the solution computed by a $p$th-order
method scales as $h^p$. The results presented in this section will be shown to follow this
scaling  law. 
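
The validation procedure described here, computing errors on a sequence of refinements and reading off the exponent $p$, can be sketched in Python as follows. A composite two-point Gauss-Legendre rule applied to a known integral stands in for the scattering code; this stand-in, the test integrand, and the function name are assumptions made to keep the example self-contained. For this rule the expected order is $p = 4$.

```python
import math
import numpy as np

def composite_gauss(f, a, b, n_panels, points_per_panel):
    """Composite Gauss-Legendre rule with equal panels of width h = (b - a) / n_panels."""
    t, w = np.polynomial.legendre.leggauss(points_per_panel)
    edges = np.linspace(a, b, n_panels + 1)
    h = edges[1] - edges[0]
    mids = 0.5 * (edges[:-1] + edges[1:])
    nodes = mids[:, None] + 0.5 * h * t[None, :]
    return 0.5 * h * np.sum(w[None, :] * f(nodes))

exact = math.e - 1.0                                   # integral of exp over [0, 1]
panels = [4, 8, 16, 32]
errs = [abs(composite_gauss(np.exp, 0.0, 1.0, n, 2) - exact) for n in panels]

# Observed order p from successive refinements: error ~ h**p, with h halved each time.
orders = [math.log(e0 / e1) / math.log(2.0) for e0, e1 in zip(errs[:-1], errs[1:])]
print(errs, orders)
```

The same two-line estimate of $p$ applies unchanged when the errors come from a scattering computation compared against a reference solution.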

The benchmark problems include a circle and an ellipse in 2D, and a sphere and an
ellipsoid  in  3D.  In  the  2D  scalar  scattering  cases,  results  for  both  Dirichlet  and  Neumann 
boundary  conditions  on  the  surface  will  be  presented;  in  the  3D  vector  (electromagnetic) 
scattering  cases,  it  will  be  assumed  that  the  surfaces  are  perfect  conductors.  The  surface 
boundary  conditions  are  chosen  mainly  for  simplicity;  similar  convergence  behavior  has 
been  shown  for  other  types  of  boundary  conditions  (such  as  impedance  boundary  conditions 
and  dielectric  interfaces)  as  well. 

A.1. Two-dimensional scalar. We solved four different integral equations to obtain 2D
scalar scattering results. For Dirichlet boundary conditions (which correspond to the TM
polarization case of electromagnetic scattering from an object with cylindrical symmetry)
the first-kind integral equation is

$$\phi^{\mathrm{inc}}(\mathbf{x}) = -\oint_C dl'\, G(\mathbf{x}, \mathbf{x}')\, \sigma(\mathbf{x}'), \tag{19}$$

and the second-kind equation is

$$-\mathbf{n} \cdot \nabla \phi^{\mathrm{inc}}(\mathbf{x}) = -\sigma(\mathbf{x}) + \oint_C dl'\, (\mathbf{n}' \cdot \nabla' G(\mathbf{x}, \mathbf{x}'))\, \sigma(\mathbf{x}'). \tag{20}$$

In these equations $\phi^{\mathrm{inc}}(\mathbf{x})$ is the incident scalar field, $G(\mathbf{x}, \mathbf{x}')$ is the 2D scalar kernel, and
$\mathbf{n}$ and $\mathbf{n}'$ are the unit normals to the contour $C$ at the field and source points, respectively.
For this polarization case, the 2D scalar source $\sigma$ is proportional to the $z$ component of
the electric current $\mathbf{J}$ in the corresponding 3D vector problem, assuming $z$ is the axis of
translational  symmetry. 

For  Neumann  boundary  conditions  (which  correspond  to  the  TE  polarization  case  of 
electromagnetic  scattering)  the  first-kind  integral  equation  is 

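$$\mathbf{n} \cdot \nabla \phi^{\mathrm{inc}}(\mathbf{x}) = \oint_C dl'\, (\mathbf{n} \cdot \nabla)(\mathbf{n}' \cdot \nabla' G(\mathbf{x}, \mathbf{x}'))\, \psi(\mathbf{x}') \tag{21}$$

and the second-kind equation is

$$\phi^{\mathrm{inc}}(\mathbf{x}) = -\oint_C dl'\, (\mathbf{n}' \cdot \nabla' G(\mathbf{x}, \mathbf{x}'))\, \psi(\mathbf{x}'). \tag{22}$$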

For this polarization case, the electric current $\mathbf{J}$ in the corresponding 3D vector problem,
assuming $z$ is the axis of translational symmetry, is related to the 2D scalar source $\psi$ by

$$\mathbf{J} = \psi\, \mathbf{n} \times \hat{\mathbf{z}}. \tag{23}$$

A combined field equation can be obtained in either case by adding the first- and second-kind
equations together using an appropriate combination coefficient [18]. Although no
combined field equation results are reported here, it should be noted that use of a combined
field formulation is often recommended because, by being insensitive to internal resonances,
it can improve the condition number of the impedance matrix.

A.1.a. 1 λ-radius circle. A circle is the ideal problem for benchmarking a high-order
scattering code because its surface is smooth and easy to define exactly, and its cross
section can be determined, for purposes of comparison, to arbitrary accuracy using the Mie
series [19]. We used FastScat to compute the bistatic cross section of a 1 λ-radius circle
whose surface obeys either Dirichlet or Neumann boundary conditions, which correspond
to TM and TE polarizations, respectively. Meshing the circle consisted of dividing it into
circular segments of equal arc length. Nystrom sample points were distributed on each
patch (parameterized by arc length) according to a Gauss-Legendre integration rule of a
given order, and Legendre polynomial testing functions up to half this order were used for
computing local corrections. The resultant local correction linear systems are square.
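
The meshing and sampling just described can be sketched in a few lines of Python. This is a minimal illustration, not FastScat code: the function name `circle_nystrom_points` and its arguments are hypothetical, NumPy's `leggauss` supplies the Gauss-Legendre rule, and the local corrections for the singular kernel are not included.

```python
import numpy as np

def circle_nystrom_points(radius, n_patches, n_points):
    """Gauss-Legendre sample points and weights on equal-arc patches of a circle."""
    xi, w = np.polynomial.legendre.leggauss(n_points)  # rule on [-1, 1]
    patch_len = 2.0 * np.pi * radius / n_patches
    s_all, w_all = [], []
    for j in range(n_patches):
        s_mid = (j + 0.5) * patch_len                # arc-length midpoint of patch j
        s_all.append(s_mid + 0.5 * patch_len * xi)   # sample points in arc length
        w_all.append(0.5 * patch_len * w)            # weights scaled to the patch
    s = np.concatenate(s_all)
    weights = np.concatenate(w_all)
    points = radius * np.column_stack((np.cos(s / radius), np.sin(s / radius)))
    return points, weights

# Four patches with 15 points each on a circle of radius 1 (in wavelengths).
points, weights = circle_nystrom_points(radius=1.0, n_patches=4, n_points=15)
```

Because each patch is parameterized by arc length, the weights on every patch sum to the patch length, and all weights together sum to the circumference.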

We performed a series of calculations with different discretizations (i.e., different numbers
of patches and different Nystrom quadrature orders) and compared the results to the Mie
series results (shown in Fig. 1). For a given Nystrom quadrature order (which we henceforth
abbreviate to Nystrom order), as the size of the patches decreases, the difference between
the exact result and the FastScat calculation also decreases.

FIG. 2. Log-log plot of maximum relative error vs unknown density for 1 λ-radius circle and TM polarization.
Each set of points is labeled by Nystrom order.

A more quantitative measure of convergence behavior is given in Fig. 2, where we have
plotted maximum relative error (defined as $\max[|\sigma(\theta)/\sigma_{\mathrm{ref}}(\theta) - 1|]$,
where $\sigma(\theta)$ and $\sigma_{\mathrm{ref}}(\theta)$ are the calculated and exact cross sections,
respectively, for $\theta = 0^\circ$ to $180^\circ$ in $1^\circ$ increments)
versus the density of unknowns for a first-kind integral formulation of the TM polarization
case. The number of patches spanning the circle ranged from 4 to 2048 and the Nystrom order
ranged from 2 to 12. One of the important features to note is that, with enough unknowns,
the data fit a linear trend line whose slope increases as the Nystrom order increases. Since
the discretization scale $h$ is inversely proportional to the density of unknowns, a linear fit
on a log-log plot of error versus unknown density reflects the fact that the error scales
asymptotically as $h^p$, where $p$ (the order of convergence) increases with Nystrom order.
Large values of $p$ signify a high-order algorithm. For the lower Nystrom orders, the slopes
of the lines connecting points of a given order are observed to be close to integers, namely
2 for order 2; 3 for order 4; and 5 for orders 6 and 8. The slopes for orders 10 and 12 are
still higher, although even at the highest sampling densities used, the discretization error
has not yet reached the asymptotic regime where each would be expected to have a slope
of 7.
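
The error measure used in these plots is simple to state in code. The sketch below is illustrative only: the function name `max_relative_error` is an assumption, and the cross-section values are made up to exercise it; they are not computed scattering results.

```python
def max_relative_error(sigma_calc, sigma_ref):
    """Maximum over angles of |sigma_calc / sigma_ref - 1|."""
    return max(abs(c / r - 1.0) for c, r in zip(sigma_calc, sigma_ref))

angles_deg = list(range(0, 181))    # 0 to 180 degrees in 1-degree increments

# Made-up cross sections, only to exercise the function.
sigma_ref = [2.0 + 0.01 * a for a in angles_deg]
sigma_calc = [s * (1.0 + 1e-6) for s in sigma_ref]
sigma_calc[90] = sigma_ref[90] * (1.0 - 3e-6)   # the worst angle in this toy data

err = max_relative_error(sigma_calc, sigma_ref)
```

Taking the maximum over all sampled angles makes the measure sensitive to deep nulls in the cross section, where a small absolute error is a large relative one.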

The  results  for  the  second-kind  integral  formulation  of  the  TM  polarization  case  are 
very  similar.  This  should  not  be  too  surprising,  since,  despite  the  additional  derivative,  the 
singularity  of  the  kernel  is  no  worse  than  log(r). 

The corresponding plot for the TE polarization case, also using a first-kind integral
formulation, is shown in Fig. 3. In the TE case, however, the first-kind integral equation
involves the 2D hypersingular kernel. The effect of using a more singular kernel is that the
source must be represented more accurately in order to achieve the same accuracy in the
cross section, or equivalently, that an equally well represented source (i.e., one employing
the same collection of unknowns) produces a less accurate value for the cross section. This
is easily seen by comparing Figs. 2 and 3. For a given discretization, the calculated cross
section for the TE case is two or more orders of magnitude less accurate than that for
the TM polarization. Nonetheless, the TE polarization data also fit linear trend lines with
integer slopes when the discretization is fine enough. In order from lowest (2) to highest
(12) Nystrom orders, the observed slopes are 2, 1, 3, 3, 5, and 5.

FIG. 3. Log-log plot of maximum relative error vs unknown density for 1 λ-radius circle and TE polarization.
Each set of points is labeled by Nystrom order.

Cross section calculations resulting from the second-kind formulation of the TE polarization
scattering problem are generally more accurate than those of the first-kind formulation.
In fact, as the Nystrom order increases, they become nearly as accurate as those for the TM
polarization case. Again, the reason is that the singularity of the kernel for the second-kind
TE case is no worse than log(r), which is also the singularity of the kernels in the first-
and second-kind TM polarization cases.

The process of improving a discretization by reducing the size of the patches is called
"h-refinement." This is what has been exhibited in the previous two figures. Keeping the
number of patches fixed and increasing the number of parameters used to describe the
source distribution on each patch, on the other hand, is known as "p-refinement." With a
high-order Nystrom code such as FastScat, p-refinement is accomplished by increasing the
Nystrom order for a given meshing. In general, this is the preferred method for improving
a discretization for two reasons: one can avoid the usually tedious process of remeshing the
scatterer, and the accuracy of the answer usually improves faster this way. The data in the
next plot demonstrate this feature.

Figure 4 presents the TM and TE polarization data given in Figs. 2 and 3 in a different
way. The behavior of the calculation for each polarization under p-refinement is illustrated
by connecting points corresponding to a fixed number of patches instead of a fixed Nystrom
order. In some cases, data points corresponding to Nystrom orders higher than 12 have
been added. The fact that the data points on a semilog plot can be connected by nearly
straight lines indicates that p-refinement can achieve exponential convergence, as opposed
to the power-law ($h^p$) convergence that was observed for h-refinement. The convergence
rate increases with patch size.
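
The contrast between the two kinds of refinement can be reproduced on a much simpler problem than scattering. The sketch below is an analogy only: it applies composite Gauss-Legendre quadrature to the integral of exp(x) over [0, 1] rather than solving an integral equation, and the helper name `composite_gauss_legendre` is an assumption of this example.

```python
import math
import numpy as np

def composite_gauss_legendre(f, a, b, n_patches, n_points):
    """Integrate f over [a, b] with an n_points Gauss-Legendre rule on each patch."""
    xi, w = np.polynomial.legendre.leggauss(n_points)
    edges = np.linspace(a, b, n_patches + 1)
    total = 0.0
    for left, right in zip(edges[:-1], edges[1:]):
        half = 0.5 * (right - left)
        total += half * np.sum(w * f(left + half * (xi + 1.0)))
    return total

exact = math.e - 1.0  # integral of exp(x) over [0, 1]

# h-refinement: fixed 2-point rule, more and more patches (power-law decay).
h_errors = [abs(composite_gauss_legendre(np.exp, 0.0, 1.0, m, 2) - exact)
            for m in (4, 8, 16)]
h_slope = math.log(h_errors[1] / h_errors[2]) / math.log(2.0)

# p-refinement: a single patch, more and more points (much faster decay).
p_errors = [abs(composite_gauss_legendre(np.exp, 0.0, 1.0, 1, n) - exact)
            for n in (1, 2, 3, 4, 5)]
```

In this toy problem the 2-point rule gains a fixed factor of about 16 per halving of the patch size, whereas adding points on a single patch gains several orders of magnitude per step.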




FIG. 4. Semilog plot of maximum relative error vs unknown density for scattering from a 1 λ-radius circle.
Points corresponding to different Nystrom quadrature orders for a fixed patch size are connected by lines (solid
for TM polarization and dashed for TE polarization) and labeled by the number of patches.


With regard to numbers of unknowns, the most efficient way to achieve high accuracy
is to use a high-order method on large patches. For example, with only four patches and
a 30th-order quadrature rule, it was possible to achieve an accuracy of $10^{-6}$ for the TM
polarization case and $10^{-4}$ in the TE case. With this discretization, the unknown density is
about 10 unknowns/wavelength and the arc length of each patch is about 1½ wavelengths.
For lower accuracies, the advantage of using large patches and high-order methods on the
circle is less clear. As a general rule, the optimum discretization is one that uses large
patches and high-order methods over smooth regions of the scatterer and smaller patches
over more highly curved regions.
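
The figures quoted for the four-patch example follow from simple arithmetic, sketched below. The one assumption, flagged in the code, is that a Gauss-Legendre rule of order 2n has n sample points, which is the convention implied where a 20th-order rule is said to have 10 sample points.

```python
import math

radius = 1.0        # circle radius in wavelengths
n_patches = 4
rule_order = 30

# Assumption: an order-2n Gauss-Legendre rule has n sample points
# (a 20th-order rule is described elsewhere in the text as 10 sample points).
points_per_patch = rule_order // 2

circumference = 2.0 * math.pi * radius
n_unknowns = n_patches * points_per_patch
unknown_density = n_unknowns / circumference     # unknowns per wavelength
patch_arc_length = circumference / n_patches     # wavelengths
```

Under that assumption there are 60 unknowns in total, a little under 10 per wavelength, on patches a little over one and a half wavelengths long.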

A.1.b. 20 λ × 2 λ ellipse. A 20 λ × 2 λ ellipse is a 2D scatterer that is less symmetric
than a circle, but is still smooth. It is a more challenging scattering problem than a
1 λ-radius circle for several reasons, not least of which is the fact that it extends much more
than a wavelength in at least one dimension. In addition, it is a good candidate problem for
applying the discretization rule described above.

In our code, the ellipse is described by the pair of parametric equations

$$x = a \cos t, \qquad y = b \sin t, \tag{24}$$

where $a = 10\,\lambda$ and $b = 1\,\lambda$. A sensible patching, which puts the highest density of patches
in the most highly curved regions and vice versa for the flatter regions, is obtained if the
patches cover equal increments in the parameter $t$. The circumference of a 20 λ × 2 λ ellipse
is about 40.64 λ.
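
The effect of patching in equal parameter increments can be checked numerically. The sketch below is not FastScat code: the function name `ellipse_patch_lengths` is hypothetical, the ellipse parameter is called `t`, and a plain composite midpoint rule is used for the arc-length integral.

```python
import math

def ellipse_patch_lengths(a, b, n_patches, n_sub=2000):
    """Arc length of each patch when patches span equal increments of t,
    for x = a cos t, y = b sin t (composite midpoint rule, n_sub cells per patch)."""
    dt_patch = 2.0 * math.pi / n_patches
    dt = dt_patch / n_sub
    lengths = []
    for j in range(n_patches):
        t0 = j * dt_patch
        total = 0.0
        for i in range(n_sub):
            t = t0 + (i + 0.5) * dt
            # |d(x, y)/dt| = sqrt(a^2 sin^2 t + b^2 cos^2 t)
            total += math.hypot(a * math.sin(t), b * math.cos(t)) * dt
        lengths.append(total)
    return lengths

# Semi-axes in wavelengths; 32 patches is an arbitrary choice for the example.
lengths = ellipse_patch_lengths(a=10.0, b=1.0, n_patches=32)
circumference = sum(lengths)
```

The patches nearest the tips (`lengths[0]`) come out several times shorter than those on the flat sides (`lengths[8]`), which is the desired concentration of patches where the curvature is highest.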

We used FastScat to compute the monostatic cross section of a 20 λ × 2 λ ellipse
discretized using several different combinations of patch number and Nystrom order. The
boundary conditions on the surface were either Dirichlet or Neumann, corresponding to
TM and TE polarizations, respectively.

FIG. 5. Monostatic cross section of a 20 λ × 2 λ ellipse for TM and TE polarizations. One quadrant of
observation angles is shown; the others may be obtained by considering the fourfold symmetry of the scatterer.


We do not have at our disposal a series solution for the cross section of an ellipse (which we
might otherwise use to compute an arbitrarily accurate reference solution). However, we can
still estimate the accuracy of the computed solutions by comparing them to the most finely
discretized solution, which we designate the "reference solution." We computed reference
solutions for the TM and TE polarization cases by meshing the ellipse into 128 patches and
putting a 20th-order Gauss-Legendre rule (i.e., 10 sample points) on each patch. We deduce
that these reference solutions are accurate to at least six decimal places, given the high-order
manner in which all the more coarsely discretized solutions are observed to converge to
them. Plots of the monostatic cross section versus incident angle for the reference solutions
are given in Fig. 5. As seen in the figure, the monostatic cross section for TM polarization
ranges from about 50 λ looking at the broadside to less than 0.1 λ looking at the tip. The
TE cross section is similar, although it is not as smooth a function of angle. In both cases,
the dynamic range of the cross section is more than 500.

The p-refinement behavior of the calculations on the ellipse using first-kind integral
equation formulations for both TM and TE polarization is shown in Fig. 6. As with the circle,
exponential convergence is observed and accurate solutions are most efficiently obtained
when the mesh consists of patches larger than a wavelength.

A.2. Three-dimensional vector. As in the 2D scalar case, first-kind and second-kind
integral formulations were explored. For 3D vector scattering off a PEC scatterer, the
first-kind formulation is the electric field integral equation (EFIE) [17]


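$$\mathbf{E}^{\mathrm{inc}}_{\tan}(\mathbf{x}) = i\omega \oint ds' \left[ -G(\mathbf{x}, \mathbf{x}')\, \mathbf{J}(\mathbf{x}') + \frac{1}{k^2} \nabla\left(\nabla' G(\mathbf{x}, \mathbf{x}') \cdot \mathbf{J}(\mathbf{x}')\right) \right]_{\tan} \tag{25}$$

and the second-kind formulation is the magnetic field integral equation (MFIE)

$$\mathbf{H}^{\mathrm{inc}}_{\tan}(\mathbf{x}) = -\mathbf{n} \times \mathbf{J}(\mathbf{x}) + \oint ds' \left[ \nabla G(\mathbf{x}, \mathbf{x}') \times \mathbf{J}(\mathbf{x}') \right]_{\tan}, \tag{26}$$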
where $G(\mathbf{x}, \mathbf{x}') = \exp(ik|\mathbf{x} - \mathbf{x}'|)/|\mathbf{x} - \mathbf{x}'|$ is the Helmholtz kernel in 3D,
$k = |\mathbf{k}| = \omega/c$ is the radiation wavenumber, $\mathbf{J}$ refers to the electric surface current,
$\mathbf{E}^{\mathrm{inc}}$ and $\mathbf{H}^{\mathrm{inc}}$ are the incident electric and magnetic fields, and the subscript tan
means that only the vector components tangent to the surface at the field point are being used.

FIG. 6. Semilog plot of maximum relative error vs unknown density for scattering from a 20 λ × 2 λ ellipse.
Points corresponding to different Nystrom orders for a fixed patch size are connected by lines (solid for TM
polarization and dashed for TE polarization) and labeled by the number of patches.

The EFIE and MFIE can be summed to form a combined field integral equation (CFIE)
having some of the same desirable properties as the CFIE in the 2D scalar case. Although
no CFIE results are reported in this paper, the same techniques apply.

Note also that, while the results presented here are restricted to PEC scatterers, it is trivial
to generalize the method to the more general scattering problem of homogeneous regions
with smooth boundaries.

A.2.a. One-fourth λ-radius sphere. Writing a code that correctly calculates 3D vector
scattering results is more difficult than writing a correct 2D scalar code. This is doubly true
if the code is designed to be high order. Therefore, it is particularly important to verify
that the output of a purportedly high-order 3D vector code actually converges to the correct
answer under both h- and p-refinement and that it does so in a high-order fashion. In this
subsection, we present results demonstrating that our 3D vector Nystrom code achieves
high-order convergence to the correct answer on a sphere.

A sphere is the ideal surface to use for benchmarking a high-order 3D vector code for the
same reasons that a circle is ideal for a high-order 2D scalar code: it is uniformly smooth
and the accuracy of computed results can be determined by comparison to the Mie series
solution. Since the size of the surface, and therefore the number of unknowns, grows in
proportion to $r^2$ for a sphere, as opposed to just $r$ for a circle, memory limitations prevented
us from pushing the unknown density on a 1 λ-radius sphere to the same extremes as were
possible on a 1 λ-radius circle. Nonetheless, when we did run FastScat on a 1 λ-radius sphere
with a wide selection of discretizations, we found that the results converged to the correct
answer just as one would expect for a high-order scattering code. To reach the asymptotic
regime, where the convergence behavior is more obvious, however, we chose the radius
of the sphere to be ¼ λ, which allows us to increase the unknown density fourfold before
running out of primary memory (for storing the full impedance matrix). For this reason
alone we present the data for the ¼ λ-radius sphere.

TABLE I
3D Quadrature Rule and Testing Function Parameters

Nystrom quadrature   Number of       Maximum testing    Number of
order                sample points   function degree    testing functions
2                    1               0                  1
3                    3               1                  3
5                    6               2                  6
7                    12              3                  10
8                    15              4                  15

The internal surface representation of the sphere corresponds to an ideal sphere and
its surface is assumed to be perfectly conducting. The coarsest patching of the sphere
consists of 20 identical triangular patches, formed by mapping the triangles of an inscribed
icosahedron onto the surface of the sphere. Finer meshes were generated by dividing each
of the 20 triangles into $n^2$ nearly identical subtriangles, where $n$ ranged from 2 up to 10. The
distribution of Nystrom quadrature points on each patch was determined by a high-order
triangle rule [16]. The triangle rule orders that we used and corresponding numbers of
sample points are given in Table I. The number of testing functions (products of monomials
in the two surface parameters) and the maximum degree of the testing functions used with
each triangle rule are also listed in the table.

In all cases except Nystrom order 7, the number of sample points equals the number of
testing functions, resulting in an exactly-determined local correction linear system. In the
seventh-order case, the maximum testing function degree was chosen to make an
underdetermined linear system.
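
The testing-function counts in Table I match the number of monomials in two variables up to a given total degree, which makes the square or underdetermined character of each local correction system easy to verify. The sketch below assumes that reading of "products of monomials in the two surface parameters"; the helper `n_monomials` and the `shapes` dictionary are names invented for this example.

```python
def n_monomials(max_degree):
    """Number of monomials u**i * v**j with i + j <= max_degree."""
    return (max_degree + 1) * (max_degree + 2) // 2

# (Nystrom order, sample points, maximum testing-function degree) from Table I.
table_rows = [(2, 1, 0), (3, 3, 1), (5, 6, 2), (7, 12, 3), (8, 15, 4)]

shapes = {}
for order, n_points, max_degree in table_rows:
    n_tests = n_monomials(max_degree)
    # One equation per testing function, one unknown weight per sample point.
    if n_tests == n_points:
        shapes[order] = "square"
    elif n_tests < n_points:
        shapes[order] = "underdetermined"
    else:
        shapes[order] = "overdetermined"
```

For the seventh-order rule, degree 3 gives 10 testing functions against 12 sample points; going to degree 4 would give 15 and make the system overdetermined instead.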

Solutions for the bistatic cross section of the ¼ λ-radius sphere were computed with the
various discretizations and compared against the Mie series solution (shown in Fig. 7). For
a sphere this small, the cross sections for the two polarizations are similar (in terms of
smoothness and dynamic range), so we present the discretization refinement results only
for the θθ case. Cross polarization results are also not presented at all, although it may be
noted that such computed cross sections were extremely small (i.e., always less than the
co-polarized results by at least eight orders of magnitude).

The convergence behavior of the scattering results under h-refinement is shown in Fig. 8.
Refining the mesh for a given Nystrom order always improves the accuracy of the solution.
It is apparent for the lower Nystrom orders that the data approach linear trend lines with
integer slopes as the patches get smaller, just as they did in 2D. In the case of the EFIE,
the slopes of the trend lines for Nystrom orders 2 and 3 are both unity and in the case of
the MFIE, they are 2 and 3, respectively. For the higher orders, the slopes appear to be
increasing, but it is not as clear what their asymptotic values will be. For Nystrom order 5,
the last pair of points produce slopes close to 3 and 5 for the EFIE and MFIE solutions,
respectively. In all cases, the solution at a particular discretization obtained by using the
less singular kernel (i.e., the MFIE) is more accurate.

FIG. 7. Bistatic cross section of a ¼ λ-radius PEC sphere for θθ and φφ polarizations computed by the Mie
series.

The behavior of the sphere results under p-refinement is shown in Fig. 9. The observed
p-refinement behavior is similar to that in the 2D scalar case. The fastest convergence is
usually achieved by applying a high-order quadrature to a coarse meshing. One notable
difference from the 2D scalar case is that the 3D vector calculation requires a higher density
of unknowns to achieve a comparable maximum relative error in the bistatic cross section.
The jaggedness of the p-refinement curves for the EFIE data may be explained by reference
to the h-refinement plot, which shows that the 2nd- and 3rd-order results have nearly the same
accuracy, and that the 7th-order results are actually less accurate than those for 5th-order.
FIG. 8. Log-log plot of maximum relative error vs unknown density for ¼ λ-radius PEC sphere in θθ
polarization. Points obtained with different meshings but the same Nystrom order are connected by lines. A solid
(dashed) line indicates use of the EFIE (MFIE) integral formulation.




FIG. 9. Semilog plot of maximum relative error vs unknown density for scattering from a ¼ λ-radius PEC
sphere. Points corresponding to different Nystrom quadrature orders for a fixed patch size are connected by lines
(solid for MFIE and dashed for EFIE) and labeled by the number of patches.

For Nystrom orders higher than about 8, problems related to ill-conditioning arise in the
EFIE formulation. Although the increasingly ill-conditioned nature of the local correction
linear system is a contributing factor, the more important contribution probably comes from
the fact that the EFIE is especially susceptible to conditioning problems when the Nystrom
sample points get too close together. Unfortunately, this is exactly what happens for the
higher-order triangle rules. As the order increases, the quadrature points tend to bunch up
near the edges and corners of the triangle. It may be possible to overcome this problem by
inventing different high-order triangle rules with better sample point spacing and by using
a better conditioned integral equation formulation such as the MFIE or CFIE (combined
field integral equation).

A.2.b. 2 λ × 2 λ × 0.2 λ ellipsoid. As an example of a smooth, but less symmetric 3D
scatterer, we next consider a PEC ellipsoid with principal axis diameters 2 λ, 2 λ, and 0.2 λ.
We computed the monostatic cross sections of this discus-shaped scatterer in θθ and φφ
polarizations using an MFIE formulation and an eighth-order quadrature rule, which put
15 points on each patch. Four different meshings, comprising 20, 80, 180, and 320 patches,
were tried. Each meshing was tailored to put smaller patches in the vicinity of the $r = 1\,\lambda$
equator, where one of the radii of curvature is small, and larger patches everywhere else,
where the surface is relatively flat. The number of unknowns distributed over the 6.47 λ²
surface of the ellipsoid in the four cases ranged from 600 with the coarsest meshing to 9600
with the finest.
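
The surface area and unknown counts quoted for the ellipsoid can be cross-checked with the closed-form area of an oblate spheroid. In the sketch below, the 15 sample points per patch are taken from the eighth-order entry of Table I, and the factor of two unknowns per sample point (two tangential current components) is an assumption inferred from the quoted totals; the function name is hypothetical.

```python
import math

def oblate_spheroid_area(a, c):
    """Surface area of an oblate spheroid with equatorial radius a and polar radius c < a."""
    e = math.sqrt(1.0 - (c / a) ** 2)
    return 2.0 * math.pi * a * a * (1.0 + (1.0 - e * e) / e * math.atanh(e))

# Semi-axes in wavelengths: diameters 2, 2, and 0.2 wavelengths.
area = oblate_spheroid_area(a=1.0, c=0.1)      # square wavelengths

points_per_patch = 15       # eighth-order triangle rule (Table I)
components_per_point = 2    # assumption: two tangential current components
unknowns = {n: n * points_per_patch * components_per_point
            for n in (20, 80, 180, 320)}
```

With these assumptions the coarsest and finest meshings give 600 and 9600 unknowns, matching the range stated above.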

As we did with the ellipse in 2D, we can designate the solution computed with the
finest discretization to be the reference solution and obtain accuracy estimates of the other
solutions by comparing them to this reference solution. Figure 10 shows the reference
solutions for the θθ and φφ polarization cases.

Differences between the reference solution and the other, less finely discretized solutions
are shown in Fig. 11. As expected, the accuracy of the solution improves as one refines the
discretization. It should also come as no surprise that the solutions are most accurate
near 0° and 180°, where the cross section is highest. What is particularly notable about this
plot, however, is the fact that the error in the cross section decreases by orders of magnitude
when one reduces the (linear) size of each patch by factors of 2 or 3. Such large reductions in
the error are a direct consequence of our using an exact surface description and a high-order
rule (8th-order, in this case) on each patch.

FIG. 10. Reference solutions for the monostatic cross section of a 2 λ × 2 λ × 0.2 λ PEC ellipsoid in θθ and
φφ polarizations. At 0° the observer is looking at the flattest part of the ellipsoid; at 90° he is looking edge on.


FIG. 11. Semilog plot of the differences between cross sections computed using meshings consisting of 20,
80, and 180 patches, and a reference cross section computed using a meshing consisting of 320 patches. The
asymmetry of each curve reflects the fact that the meshings did not possess reflection symmetry. Horizontal
axis: angle (degrees), 0 to 180.
B. Run-Time Performance Comparisons

In this section we compare the run-time performance of our high-order Nystrom
implementation of FastScat to that of two method of moments scattering codes. The first
comparison code is an earlier, high-order Galerkin implementation of FastScat [20]. The
second is a low-order code (RWG basis and testing functions on flat facets) called FISC
[21]. We ran each code under comparable conditions to obtain solutions for the bistatic
cross section in the θθ polarization of three different size PEC spheres. The high-order
Nystrom discretizations were constructed using an eighth-order quadrature rule (15 sample
points per patch) and fourth-degree testing functions for computing local corrections. The
high-order Galerkin discretizations were constructed from the same surface mesh using
patch-based, polynomial (in the parameterization) basis functions up to degree 4 to give
the same number of unknowns per patch, namely 30. The surface mesh used by FISC was
necessarily different from that used by both versions of FastScat because, with an RWG
discretization, one unknown is associated with each edge rather than multiple unknowns
being associated with each patch. Nonetheless, its surface meshes were constructed to
maintain the density of unknowns at about 7.7 unknowns/wavelength, the same as for both
FastScat discretizations. All computations were performed using a dense matrix fill, an
LUD solver, and an MFIE formulation.
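
The quoted density of about 7.7 unknowns/wavelength can be reproduced from the sphere radii and FastScat unknown counts in Table II, provided the linear density on a surface is read as the square root of the number of unknowns per square wavelength. That reading is an assumption of this sketch, as is the function name.

```python
import math

def linear_unknown_density(n_unknowns, radius):
    """Square root of the number of unknowns per square wavelength on a sphere."""
    area = 4.0 * math.pi * radius ** 2
    return math.sqrt(n_unknowns / area)

# (radius in wavelengths, number of FastScat unknowns) from Table II.
cases = [(0.9, 600), (1.8, 2400), (2.7, 5400)]
densities = [linear_unknown_density(n, r) for r, n in cases]
```

All three FastScat cases give the same value because the number of unknowns was scaled in proportion to the surface area.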

Table II gives a summary of the results. The reported times are run times on a SPARC-10
workstation with 512 MB primary memory. The total run time is broken into setup time
(which includes the time spent setting up the problem and filling the impedance matrix) and
solve time (which includes the time spent performing the LUD and solving for the bistatic
cross section at 181 angles).

TABLE II
Nystrom vs Galerkin Performance on PEC Spheres

Scattering code       Radius (λ)   No. of unknowns   Setup time (s)   Solve time (s)   RMS error (dB)
FastScat (Nystrom)    0.9          600               74               36               0.35
FastScat (Galerkin)   0.9          600               972              88               0.07
FISC (Galerkin)       0.9          600               83               42               1.28
FastScat (Nystrom)    1.8          2400              539              2742             0.26
FastScat (Galerkin)   1.8          2400              8177             3395             0.05
FISC (Galerkin)       1.8          2430              873              2255             0.61
FastScat (Nystrom)    2.7          5400              1953             31735            0.097
FastScat (Galerkin)   2.7          5400              38803            36152            0.021
FISC (Galerkin)       2.7          5880              8230             28795            0.723

In comparing the results from the two high-order implementations of FastScat, two
features are especially noteworthy. The first is that the high-order Galerkin result is more
accurate by about a factor of 5 than the high-order Nystrom result. The second is that use
of the Nystrom discretization can speed up the setup phase of the computation enormously,
with the speedup factor increasing as the number of unknowns increases. The observation
that the high-order Galerkin code computes results somewhat more accurately than the
Nystrom code is consistent with our experience computing cross sections for other
scatterers, both in 2D and 3D. It is compensated, however, by the fact that the setup phase (and
to a lesser extent the solve phase) runs much faster using the Nystrom code. Furthermore,
the factor of 5 difference in accuracy is actually less significant in this case than it would
be if we were comparing low-order codes. Given the $O(h^9)$ convergence rate expected of
an eighth-order quadrature rule, it should be possible to recover the factor of 5 in accuracy
with further h-refinement by a modest 20%.
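
The size of the refinement needed to recover a factor of 5 in accuracy follows directly from the power-law scaling of the error. The sketch below assumes ninth-order convergence (error proportional to h to the ninth power); the variable names are illustrative.

```python
error_ratio = 5.0    # the Galerkin result is about 5 times more accurate
p = 9                # assumed order of convergence, error ~ h**p

# Since error ~ h**p, reducing the error by error_ratio requires h to shrink by
# error_ratio**(1/p); the linear density of unknowns grows by the same factor.
refinement_factor = error_ratio ** (1.0 / p)
percent_increase = 100.0 * (refinement_factor - 1.0)
```

Under that assumption the linear density of unknowns needs to rise by roughly 20%. For a second-order method the same calculation gives a factor of about 2.2, which is why the accuracy gap matters less for a high-order code.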

The  high-order  Nystrom  code  computes  more  accurate  answers  than  the  low-order 
Galerkin  code  (FISC)  in  all  cases.  For  the  spheres  considered  here,  this  is  largely  due 
to  the  fact  that  FISC  uses  a  low-order  surface  representation.  The  high-order  Nystrom  code 
also  requires  less  setup  time,  an  advantage  that  grows  as  the  problems  get  bigger.  Even  a 
comparison  based  on  total  solution  time  shows  the  high-order  Nystrom  implementation  of 
FastScat  to  be  more  efficient  for  computing  accurate  answers. 
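
The trends described in this comparison can be read off Table II with a few ratios. The sketch below simply copies the table entries; the list and function names are chosen for this example.

```python
# (radius in wavelengths, setup time in s, solve time in s, RMS error in dB), from Table II.
nystrom = [(0.9, 74, 36, 0.35), (1.8, 539, 2742, 0.26), (2.7, 1953, 31735, 0.097)]
galerkin = [(0.9, 972, 88, 0.07), (1.8, 8177, 3395, 0.05), (2.7, 38803, 36152, 0.021)]
fisc = [(0.9, 83, 42, 1.28), (1.8, 873, 2255, 0.61), (2.7, 8230, 28795, 0.723)]

def setup_ratio(other, ref):
    """Setup time of `other` divided by setup time of `ref`, case by case."""
    return [o[1] / r[1] for o, r in zip(other, ref)]

galerkin_vs_nystrom_setup = setup_ratio(galerkin, nystrom)
fisc_vs_nystrom_setup = setup_ratio(fisc, nystrom)
fisc_vs_nystrom_error = [f[3] / n[3] for f, n in zip(fisc, nystrom)]
```

Both setup-time ratios grow with problem size (from about 13 to about 20 against the high-order Galerkin code, and from about 1.1 to about 4.2 against FISC), and the FISC RMS error exceeds the Nystrom error in every case.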

Finally, it is useful to note that an equivalent Nystrom discretization exists for every
method of moments discretization and vice versa [22], so it is possible, at least in principle,
to eliminate the observed accuracy discrepancy between the two versions of FastScat by
implementing a Nystrom code whose discretization error precisely matches that obtained by
the Galerkin code. We have not attempted to do this, but suspect that to do so would entail
additional complications and computations that would negate the substantial simplicity
and efficiency of the present implementation. On balance, we find the high-order Nystrom
method in its present form preferable to the high-order Galerkin method for solving integral
equations, especially when one adds in its other benefits such as reduced implementation
complexity and potential for significantly improved FMM performance.


V.  SUMMARY 

The  standard  Nystrom  method  is  a  simple  and  efficient  mechanism  for  discretizing  inte¬ 
gral  equations.  We  have  shown  how  it  can  be  adapted  to  provide  a  high-order  discretization 
of  the  boundary  integral  equations  of  wave  scattering  in  2D  and  3D,  which  have  singular 
kernels.  Numerical  results  obtained  with  a  software  implementation  of  this  method  show 
that  the  algorithm  can  achieve  high-order  convergence  to  the  correct  answer  for  scattering 
cross  sections  in  2D  and  3D.  We  also  demonstrated  that  a  high-order  Nystrom  code  consid¬ 
erably  reduces  the  CPU  time  cost  of  a  scattering  calculation  by  comparison  to  a  high-order 
Galerkin  code,  especially  the  precomputation  time  cost.  The  high-order  Nystrom  code  also 
outperformed a well-tuned, low-order Galerkin code (FISC) in terms of solution accuracy
and  total  run  time.  Demonstrations  of  how  a  high-order  Nystrom  code  can  be  used  in  con¬ 
junction  with  the  FMM  to  reduce  the  memory  and  CPU  time  requirements  of  solving  large 
scattering  problems  will  be  the  subject  of  a  future  publication. 


APPENDIX 


A.  Local  Corrections 

Eleven different kernels arise in boundary integral equation formulations of 2D scalar,
3D scalar, and 3D electromagnetic scattering:

2D & 3D Scalar:
$G(r)$
$\hat{n}' \cdot \nabla' G(r)$
$\hat{n} \cdot \nabla G(r)$
$(\hat{n} \cdot \nabla)(\hat{n}' \cdot \nabla' G(r))$

3D Electromagnetic:
$G(r)\,(\mathbf{t}(\mathbf{x}) \cdot \mathbf{t}'(\mathbf{x}'))$
$\mathbf{t}(\mathbf{x}) \cdot (\nabla' G(r) \times \mathbf{t}'(\mathbf{x}'))$
$(\mathbf{t}(\mathbf{x}) \cdot \nabla)(\nabla' G(r) \cdot \mathbf{t}'(\mathbf{x}'))$

where

$$G(r) = \begin{cases} \dfrac{i}{4} H_0^{(1)}(kr) & \text{in 2D}, \\[4pt] \dfrac{e^{ikr}}{r} & \text{in 3D}, \end{cases} \tag{27}$$

$r$ is the magnitude of the vector $\mathbf{r} = \mathbf{x}' - \mathbf{x}$ from the field point at $\mathbf{x}$ to the source point
at $\mathbf{x}'$; $k$ is the wavenumber of the waves; $\hat{n}$ and $\hat{n}'$ are the unit normals to the surface at
the field and source points, respectively; $\nabla$ and $\nabla'$ are gradient operators for the field and
source coordinates, respectively; and $H_0^{(1)}$ refers to the zeroth-order Hankel function of the
first kind, defined by $H_0^{(1)}(x) = J_0(x) + iY_0(x)$, where $J_n(x)$ and $Y_n(x)$ represent $n$th-order
Bessel functions of the first and second kinds, respectively.

For the 3D electromagnetic case, the source and excitation are surface tangent vectors, so
it becomes necessary to compute local corrections for four scalar kernels, one for each of the
four combinations of (two) independent surface tangent vectors at the field point and (two)
independent surface tangent vectors at the source point. These surface tangent vectors at
the field and source points, represented by $\mathbf{t}(\mathbf{x})$ and $\mathbf{t}'(\mathbf{x}')$, respectively, are included as part
of the 3D electromagnetic kernel in recognition of this fact and for clarity of presentation.

In  this  section,  we  show  how  to  compute  local  corrections  for  each  of  these  kernels.  We 
will  make  use  of  the  vector  calculus  identity  [23] 

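$$(\hat{n} \cdot \nabla)(\hat{n}' \cdot \nabla' g(r)) = (\hat{n} \cdot \hat{n}')(\nabla \cdot \nabla' g(r)) - (\hat{n} \times \nabla) \cdot (\hat{n}' \times \nabla' g(r)) \tag{28}$$

$$= (\hat{n} \cdot \hat{n}')\,k^2 g(r) - (\hat{n} \times \nabla) \cdot (\hat{n}' \times \nabla' g(r)), \tag{29}$$

where the second line follows if $g(r)$ obeys the homogeneous Helmholtz equation

$$(\nabla^2 + k^2)\,g(r) = 0. \tag{30}$$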

This  identity  allows  one  to  convert  between  double  normal  derivative  and  double  tangential 
derivative  operators  on  the  Green  function. 

A.1. Two-dimensional scalar.

A.1.a. $G(r)$.

$$G(r) = \frac{i}{4} H_0^{(1)}(kr) = \underbrace{\frac{i}{4} J_0(kr)}_{\text{regular}} - \underbrace{\frac{1}{4} Y_0(kr)}_{\text{singular}}. \tag{31}$$

This kernel may be written as the sum of a regular part and a singular part. It is necessary
to compute local corrections only for the singular part because the regular part will be
efficiently integrated by the underlying high-order quadrature rule. The function $Y_0(kr)$
contains a $\log(r)$ singularity. Therefore, one can use "lin-log" quadrature rules [24] to
efficiently compute local correction integrals when the region of integration contains the
field point, and Gauss-Legendre rules otherwise.
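
The regular/singular split can be made concrete with the standard ascending series for $J_0$ and $Y_0$, in which the logarithm appears explicitly. The following is a minimal sketch using only the Python standard library; the function names are illustrative, and the truncated series are only meant for small to moderate arguments.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def bessel_j0(x, terms=30):
    """J0(x) from its ascending power series (smooth in x)."""
    total, term = 0.0, 1.0
    for m in range(terms):
        if m > 0:
            term *= -(x * x / 4.0) / (m * m)  # (-1)^m (x^2/4)^m / (m!)^2
        total += term
    return total

def bessel_y0(x, terms=30):
    """Y0(x) = (2/pi)(ln(x/2) + gamma) J0(x) + smooth power series."""
    log_part = (2.0 / math.pi) * (math.log(x / 2.0) + EULER_GAMMA) * bessel_j0(x, terms)
    smooth, term, harmonic = 0.0, 1.0, 0.0
    for m in range(1, terms):
        term *= -(x * x / 4.0) / (m * m)
        harmonic += 1.0 / m
        smooth -= harmonic * term             # (-1)^(m+1) H_m (x^2/4)^m / (m!)^2
    return log_part + (2.0 / math.pi) * smooth

def green_2d(k, r):
    """2D kernel (i/4) H0^(1)(kr): regular J0 part plus log-singular Y0 part."""
    return 0.25j * bessel_j0(k * r) - 0.25 * bessel_y0(k * r)

for r in (1e-1, 1e-3, 1e-5):
    g = green_2d(2.0, r)
    print(r, g.real, g.imag)  # real part grows like log(r); imaginary part stays near 1/4
```

Only the term multiplying $\ln(x/2)$ needs a special quadrature rule; everything else is a convergent power series in $x^2$.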

A.1.b. $\hat{n}' \cdot \nabla' G(r)$.

$$\hat{n}' \cdot \nabla' G(r) = \frac{\hat{n}' \cdot \mathbf{r}}{r} \frac{d}{dr} G(r) = -\underbrace{\frac{ik^2}{4} (\hat{n}' \cdot \mathbf{r}) \frac{J_1(kr)}{kr}}_{\text{regular}} + \underbrace{\frac{1}{4} \frac{\hat{n}' \cdot \mathbf{r}}{r^2}}_{\text{regular}}\, \underbrace{kr\,Y_1(kr)}_{\text{singular}}. \tag{32}$$

The first term is regular; the second is singular. The second term is singular not because
its value diverges at the origin (in fact, $\lim_{r \to 0} (\hat{n}' \cdot \mathbf{r}/r^2)\,kr\,Y_1(kr) = 1/\pi R$, where $R$ is the
radius of curvature of the surface at the field point), but because its higher derivatives do.
The singularity is still a $\log(r)$ singularity, so local correction integrals can be computed in
the same manner as for the previous kernel.
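
The finite limit of the geometric factor $(\hat{n}' \cdot \mathbf{r})/r^2$ can be checked numerically. The sketch below is an illustration on an ellipse, not part of the original analysis; it assumes the convention $\mathbf{r} = \mathbf{x}' - \mathbf{x}$ and an outward normal, for which the factor tends to $1/(2R)$ with $R = b^2/a$ at the end of the major axis. Combined with $kr\,Y_1(kr) \to -2/\pi$, the product has magnitude $1/(\pi R)$.

```python
import math

def normal_ratio_on_ellipse(a, b, t):
    """(n' . r) / r^2 on the ellipse (a cos t, b sin t).

    The field point sits at parameter 0 and the source point at parameter t.
    Assumed conventions: r = x' - x and n' is the outward unit normal at x'.
    """
    x_field = (a, 0.0)
    x_source = (a * math.cos(t), b * math.sin(t))
    # The outward normal at the source point is proportional to (b cos t, a sin t).
    nx, ny = b * math.cos(t), a * math.sin(t)
    length = math.hypot(nx, ny)
    nx, ny = nx / length, ny / length
    rx, ry = x_source[0] - x_field[0], x_source[1] - x_field[1]
    return (nx * rx + ny * ry) / (rx * rx + ry * ry)

a, b = 2.0, 1.0
radius_of_curvature = b * b / a  # at the field point (a, 0)
for t in (1e-1, 1e-2, 1e-3):
    print(t, normal_ratio_on_ellipse(a, b, t), 1.0 / (2.0 * radius_of_curvature))
```

The ratio approaches a constant as the source point approaches the field point, so the factor itself is bounded; the singular behavior comes from the logarithm hidden in $Y_1$.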

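A.1.c. $\hat{n} \cdot \nabla G(r)$.

$$\hat{n} \cdot \nabla G(r) = -\frac{\hat{n} \cdot \mathbf{r}}{r} \frac{d}{dr} G(r) = \underbrace{\frac{ik^2}{4} (\hat{n} \cdot \mathbf{r}) \frac{J_1(kr)}{kr}}_{\text{regular}} - \underbrace{\frac{1}{4} \frac{\hat{n} \cdot \mathbf{r}}{r^2}}_{\text{regular}}\, \underbrace{kr\,Y_1(kr)}_{\text{singular}}. \tag{33}$$

This kernel is identical to that for $\hat{n}' \cdot \nabla' G(r)$ with $\hat{n}'$ replaced by $-\hat{n}$, and it has similar
properties.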

A.1.d. $(\hat{n} \cdot \nabla)(\hat{n}' \cdot \nabla' G(r))$.

$$(\hat{n} \cdot \nabla)(\hat{n}' \cdot \nabla' G(r)) = \frac{(\hat{n} \cdot \mathbf{r})(\hat{n}' \cdot \mathbf{r})}{r^2} \left( \frac{1}{r} \frac{dG(r)}{dr} - \frac{d^2 G(r)}{dr^2} \right) - \frac{(\hat{n} \cdot \hat{n}')}{r} \frac{dG(r)}{dr} \tag{34}$$

$$= \underbrace{\frac{ik^2}{4} \left[ (\hat{n} \cdot \hat{n}') \frac{J_1(kr)}{kr} - \frac{(\hat{n} \cdot \mathbf{r})(\hat{n}' \cdot \mathbf{r})}{r^2} J_2(kr) \right]}_{\text{regular}} + \underbrace{(\hat{n} \cdot \nabla)(\hat{n}' \cdot \nabla' G^R(r))}_{\text{hypersingular}}. \tag{35}$$

Applying the derivatives to the real part of $G(r)$, namely $G^R(r) = -\frac{1}{4} Y_0(kr)$, produces a
term that is not merely singular but hypersingular. When convolved with a regular function,
this term is not (in general) integrable because it diverges like $1/r^2$ relative to the field point.
The following discussion shows how to manipulate it into a form that allows numerical
evaluation when the region of integration contains the field point. When the region of
integration does not include the field point, Gauss-Legendre rules may be used.

The convolution of $(\hat{n} \cdot \nabla)(\hat{n}' \cdot \nabla' G^R(r))$ with testing function $f(\mathbf{x}')$ is

$$\int dl'\,(\hat{n} \cdot \nabla)(\hat{n}' \cdot \nabla' G^R(r))\,f(\mathbf{x}'). \tag{36}$$

Strictly speaking, this is not a proper integral unless it is assumed to represent the limiting
value as the field point approaches the surface from off the surface. We implicitly make
this assumption throughout. Using the vector identity (29) and the fact that $G^R(r)$ obeys
the homogeneous Helmholtz equation when $\mathbf{x}$ is not on $S$, we can convert the double normal
derivative operator to a double tangential derivative operator:

$$\int dl' \left[ k^2 (\hat{n} \cdot \hat{n}') G^R(r) - (\hat{n} \times \nabla) \cdot (\hat{n}' \times \nabla' G^R(r)) \right] f(\mathbf{x}'). \tag{37}$$
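In 2D, we can rewrite the second term even more explicitly in terms of tangential derivatives,
obtaining

$$\int dl' \left[ k^2 (\hat{n} \cdot \hat{n}') G^R(r) - (\mathbf{t} \cdot \nabla)(\mathbf{t}' \cdot \nabla' G^R(r)) \right] f(\mathbf{x}'), \tag{38}$$

where $\mathbf{t}$ and $\mathbf{t}'$ are unit tangent vectors at the field and source points, respectively. The first
term has a $\log(r)$ singularity, which we already know how to integrate numerically; the
second term is hypersingular and requires further manipulation.

The gradient operators $\nabla$ and $\nabla'$ commute with the unit tangent vectors $\mathbf{t}'$ and $\mathbf{t}$, respectively,
so we can rearrange the factors of the second term and integrate it by parts as

$$-\int dl'\,(\mathbf{t} \cdot \nabla)(\mathbf{t}' \cdot \nabla' G^R(r))\,f(\mathbf{x}') \tag{39}$$

$$= -\int dl'\,f(\mathbf{x}')\,\mathbf{t}' \cdot \nabla' (\mathbf{t} \cdot \nabla G^R(r))$$

$$= -\int dl'\,\mathbf{t}' \cdot \nabla' \big( f(\mathbf{x}') (\mathbf{t} \cdot \nabla G^R(r)) \big) + \int dl'\,(\mathbf{t}' \cdot \nabla' f(\mathbf{x}'))(\mathbf{t} \cdot \nabla G^R(r)). \tag{40}$$

The first integral on the right-hand side of (40) is

$$-\int dl'\,\mathbf{t}' \cdot \nabla' \big( f(\mathbf{x}') (\mathbf{t} \cdot \nabla G^R(r)) \big) = -\int d\mathbf{l}' \cdot \nabla' \big( f(\mathbf{x}') (\mathbf{t} \cdot \nabla G^R(r)) \big) \tag{41}$$

$$= -\Big[ f(\mathbf{x}') (\mathbf{t} \cdot \nabla G^R(r)) \Big]_{\text{endpoints}}; \tag{42}$$

i.e., since the integrand is a total derivative, the value of the integral is a difference of values
at the endpoints. Rearranging factors and using

$$\nabla G^R(r) = -\nabla' G^R(r), \tag{43}$$

we can rewrite the second integral as

$$-\int dl'\,\nabla' G^R(r) \cdot \left[ \mathbf{t}\,(\mathbf{t}' \cdot \nabla' f(\mathbf{x}')) \right]. \tag{44}$$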


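In this form, the integral is not yet evaluable because $\nabla' G^R(r)$ diverges like $1/r$ relative to
the field point. We can make it integrable by adding and subtracting a smooth function that
matches the integrand at the field point. Specifically, let us write (44) as

$$-\int dl'\,\nabla' G^R(r) \cdot \left[ \mathbf{t}\,(\mathbf{t}' \cdot \nabla' f(\mathbf{x}')) - \mathbf{t}'\,(\mathbf{t} \cdot \nabla' f(\mathbf{x})) \right] - \int dl'\,\nabla' G^R(r) \cdot \left[ \mathbf{t}'\,(\mathbf{t} \cdot \nabla' f(\mathbf{x})) \right], \tag{45}$$

where $\mathbf{t} \cdot \nabla' f(\mathbf{x})$ and $\mathbf{t}' \cdot \nabla' f(\mathbf{x}')$ represent tangential derivatives of the testing function $f$
evaluated at the field and source points, respectively. The first integral in this expression is
integrable because the zero of

$$\left[ \mathbf{t}\,(\mathbf{t}' \cdot \nabla' f(\mathbf{x}')) - \mathbf{t}'\,(\mathbf{t} \cdot \nabla' f(\mathbf{x})) \right] \tag{46}$$

at the field point cancels the pole from $\nabla' G^R(r)$ at the field point, leaving a singularity no
worse than $\log(r)$ relative to the field point. By rearranging factors, the integrand of the
second integral can be shown to be a total derivative, so that

$$-\int dl'\,\nabla' G^R(r) \cdot \left[ \mathbf{t}'\,(\mathbf{t} \cdot \nabla' f(\mathbf{x})) \right] = -\int dl'\,\mathbf{t}' \cdot \nabla' \big( G^R(r) (\mathbf{t} \cdot \nabla' f(\mathbf{x})) \big) \tag{47}$$

$$= -\Big[ G^R(r) (\mathbf{t} \cdot \nabla' f(\mathbf{x})) \Big]_{\text{endpoints}}. \tag{48}$$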
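
The add-and-subtract step can be illustrated on a much simpler one-dimensional analogue. The sketch below is not the kernel treated in this appendix: it evaluates the Cauchy principal value of $\int_{-1}^{1} f(s)/s\,ds$ by subtracting $f(0)$, so that the remaining integrand is smooth and the subtracted piece is known in closed form (it vanishes by symmetry). The helper names are introduced here for illustration.

```python
import math

def gauss_legendre(n):
    """Nodes and weights of the n-point Gauss-Legendre rule on [-1, 1]."""
    nodes, weights = [], []
    for i in range(1, n + 1):
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))  # initial guess
        for _ in range(50):                             # Newton iteration on P_n
            p_prev, p = 1.0, x
            for k in range(2, n + 1):
                p_prev, p = p, ((2 * k - 1) * x * p - (k - 1) * p_prev) / k
            dp = n * (x * p - p_prev) / (x * x - 1.0)
            step = p / dp
            x -= step
            if abs(step) < 1e-15:
                break
        p_prev, p = 1.0, x                              # derivative at the root
        for k in range(2, n + 1):
            p_prev, p = p, ((2 * k - 1) * x * p - (k - 1) * p_prev) / k
        dp = n * (x * p - p_prev) / (x * x - 1.0)
        nodes.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights

def principal_value(f, n=12):
    """PV of the integral of f(s)/s over [-1, 1] by singularity subtraction.

    (f(s) - f(0))/s is smooth, and f(0) * PV int ds/s = 0 by symmetry.
    An even n is assumed so that no quadrature node falls on s = 0.
    """
    nodes, weights = gauss_legendre(n)
    f0 = f(0.0)
    return sum(w * (f(s) - f0) / s for s, w in zip(nodes, weights))

approx = principal_value(math.exp)
# Reference from the Taylor series: sum over odd m of 2 / (m * m!).
reference = sum(2.0 / (m * math.factorial(m)) for m in range(1, 30, 2))
print(approx, reference)
```

The same pattern appears in (45): the bracketed difference vanishes at the field point, and the subtracted term is handled analytically, here by symmetry and in the text as a total derivative.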


Putting  the  various  terms  together,  we  arrive  at  the  following  numerically  tractable  expres¬ 
sion  for  the  integral  needed  to  compute  local  corrections  for  the  hypersingular  component 
of  the  kernel 


As in the 2D scalar case, this kernel may be written as the sum of a regular part and a singular
part. It is necessary to compute local corrections only for the singular part because the
regular part will be efficiently integrated by the underlying high-order quadrature rule.
The singular term contains a $1/r$ singularity. Computing local corrections for the singular
part requires evaluation of integrals of $\cos(kr)/r$ times polynomials in the parameters
$\mathbf{u} = (u^1, u^2)$ used to describe the surface. When the region of integration contains the field
point, it may be subdivided into triangles with the field point at one vertex, and the integration
may be performed by using the Duffy transformation [25] and Gauss-Legendre product
rules on the subtriangles. Otherwise, one can apply efficient quadrature rules for smooth
functions such as high-order triangle rules [16].
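
As an illustration of the Duffy transformation, the sketch below integrates $1/r$ over a flat reference triangle with the singular point at one vertex. It is a minimal example under simplifying assumptions (flat triangle, pure $1/r$ integrand, helper names introduced here), not the curved-patch implementation described in the text. The map $(u, v) \mapsto (u(1-v),\, uv)$ has Jacobian $u$, which cancels the $1/r$ singularity.

```python
import math

def gauss_legendre_unit(n):
    """Nodes and weights of the n-point Gauss-Legendre rule mapped to [0, 1]."""
    nodes, weights = [], []
    for i in range(1, n + 1):
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))  # initial guess
        for _ in range(50):                             # Newton iteration on P_n
            p_prev, p = 1.0, x
            for k in range(2, n + 1):
                p_prev, p = p, ((2 * k - 1) * x * p - (k - 1) * p_prev) / k
            dp = n * (x * p - p_prev) / (x * x - 1.0)
            step = p / dp
            x -= step
            if abs(step) < 1e-15:
                break
        p_prev, p = 1.0, x                              # derivative at the root
        for k in range(2, n + 1):
            p_prev, p = p, ((2 * k - 1) * x * p - (k - 1) * p_prev) / k
        dp = n * (x * p - p_prev) / (x * x - 1.0)
        nodes.append(0.5 * (x + 1.0))
        weights.append(1.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights

def integrate_triangle_duffy(func, n=16):
    """Integrate func(x, y) over the triangle (0,0), (1,0), (0,1).

    The substitution x = u(1 - v), y = u v maps the unit square onto the
    triangle; its Jacobian u cancels a 1/r singularity at the vertex (0,0).
    """
    nodes, weights = gauss_legendre_unit(n)
    total = 0.0
    for u, wu in zip(nodes, weights):
        for v, wv in zip(nodes, weights):
            total += wu * wv * func(u * (1.0 - v), u * v) * u
    return total

def inverse_distance(x, y):
    return 1.0 / math.hypot(x, y)

approx = integrate_triangle_duffy(inverse_distance)
exact = math.sqrt(2.0) * math.asinh(1.0)  # closed form for this triangle
print(approx, exact)
```

After the substitution the integrand is smooth in both variables, so an ordinary Gauss-Legendre product rule converges rapidly.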

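A.2.b. $\hat{n}' \cdot \nabla' G(r)$.

$$\hat{n}' \cdot \nabla' G(r) = \frac{\hat{n}' \cdot \mathbf{r}}{r} \frac{d}{dr} G(r) = \frac{(ikr - 1)\,e^{ikr}}{r^2}\,\frac{\hat{n}' \cdot \mathbf{r}}{r} \tag{52}$$

$$= \underbrace{ik^3 \left( \frac{\cos(kr) - \sin(kr)/(kr)}{(kr)^2} \right)}_{\text{regular}} \underbrace{(\hat{n}' \cdot \mathbf{r})}_{\text{regular}} - \underbrace{\big( \cos(kr) + (kr)\sin(kr) \big)}_{\text{regular}}\, \underbrace{\frac{(\hat{n}' \cdot \mathbf{r})}{r^2}\,\frac{1}{r}}_{\text{singular}}. \tag{53}$$

In 2D, $(\hat{n}' \cdot \mathbf{r})/r^2$ is a regular function with a removable singularity at the origin. In 3D, the
singularity is removable only if the principal radii of curvature of the surface at the field
point are the same. Otherwise its limiting value depends on the direction from which the
origin is approached. Nonetheless, local correction integrals can be computed efficiently by
means of triangle subdivision and the Duffy transformation.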

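A.2.c. $\hat{n} \cdot \nabla G(r)$.

$$\hat{n} \cdot \nabla G(r) = \underbrace{-ik^3 \left( \frac{\cos(kr) - \sin(kr)/(kr)}{(kr)^2} \right)}_{\text{regular}} \underbrace{(\hat{n} \cdot \mathbf{r})}_{\text{regular}} + \underbrace{\big( \cos(kr) + (kr)\sin(kr) \big)}_{\text{regular}}\, \underbrace{\frac{(\hat{n} \cdot \mathbf{r})}{r^2}\,\frac{1}{r}}_{\text{singular}}. \tag{54}$$

This kernel is identical to that for $\hat{n}' \cdot \nabla' G(r)$ with $\hat{n}'$ replaced by $-\hat{n}$ and has similar
properties.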

A.2.d. $(\hat{n} \cdot \nabla)(\hat{n}' \cdot \nabla' G(r))$.

$$(\hat{n} \cdot \nabla)(\hat{n}' \cdot \nabla' G(r)) = (\hat{n} \cdot \hat{n}') \left( \frac{1 - ikr}{r^3} \right) e^{ikr} + (\hat{n} \cdot \mathbf{r})(\hat{n}' \cdot \mathbf{r}) \left( \frac{k^2 r^2 + 3ikr - 3}{r^5} \right) e^{ikr} \tag{55}$$

$$= \underbrace{ik^3 \left[ \frac{\sin(kr)/(kr) - \cos(kr)}{(kr)^2}\,(\hat{n} \cdot \hat{n}') + k^2\,\frac{((kr)^2 - 3)\sin(kr) + 3kr\cos(kr)}{(kr)^5}\,(\hat{n} \cdot \mathbf{r})(\hat{n}' \cdot \mathbf{r}) \right]}_{\text{regular}} + \underbrace{(\hat{n} \cdot \nabla)(\hat{n}' \cdot \nabla' G^R(r))}_{\text{hypersingular}}. \tag{56}$$

Applying the derivatives to the real part of $G(r)$, namely $G^R(r) = \cos(kr)/r$, produces a
term that is not merely singular but hypersingular. When convolved with a regular function,
this term is not (in general) integrable because it diverges like $1/r^3$ relative to the field point.
The following discussion shows how to manipulate it into a form that allows numerical
evaluation when the region of integration contains the field point. When the region of
integration does not include the field point, standard, high-order rules for integrating regular,
two-parameter functions may be used.

The convolution of $(\hat{n} \cdot \nabla)(\hat{n}' \cdot \nabla' G^R(r))$ with testing function $f(\mathbf{x}')$ is

$$\int_S ds'\,(\hat{n} \cdot \nabla)(\hat{n}' \cdot \nabla' G^R(\mathbf{x}, \mathbf{x}'))\,f(\mathbf{x}') \tag{57}$$

or

$$\int_S ds' \left[ k^2 (\hat{n} \cdot \hat{n}') G^R(\mathbf{x}, \mathbf{x}') - (\hat{n} \times \nabla) \cdot (\hat{n}' \times \nabla' G^R(\mathbf{x}, \mathbf{x}')) \right] f(\mathbf{x}'), \tag{58}$$

where the second form follows from Eq. (29). As in the 2D case, we implicitly assume a
limiting procedure whereby the field point approaches its final destination on the surface
from off the surface. The first term in brackets is only singular like $1/r$; we already know
how to deal with such expressions. It is the second term that requires further attention.
Write this term in component form using the Levi-Civita tensor $\epsilon_{ijk}$ and manipulate the
expression as shown using the fact that $\mathbf{x}$ and $\mathbf{x}'$ are independent. Summation over repeated
indices is implied.


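$$-\int_S ds'\,\big( (\hat{n} \times \nabla) \cdot (\hat{n}' \times \nabla' G^R(\mathbf{x}, \mathbf{x}')) \big) f(\mathbf{x}') \tag{59}$$

$$= -(\hat{n} \times \nabla) \cdot \int_S ds'\,(\hat{n}' \times \nabla' G^R(\mathbf{x}, \mathbf{x}'))\,f(\mathbf{x}') \tag{60}$$

$$= -\epsilon_{ijk} n_j \partial_k \left[ \int_S ds'\,(\hat{n}' \times \nabla' G^R(\mathbf{x}, \mathbf{x}'))\,f(\mathbf{x}') \right]_i \tag{61}$$

$$= -\epsilon_{ijk} n_j \left[ \int_S ds'\,\big( \hat{n}' \times \nabla' (\partial_k G^R(\mathbf{x}, \mathbf{x}')) \big) f(\mathbf{x}') \right]_i \tag{62}$$

$$= -\epsilon_{ijk} n_j \left[ \int_S ds'\,\hat{n}' \times \nabla' \big( f(\mathbf{x}')\,\partial_k G^R(\mathbf{x}, \mathbf{x}') \big) \right]_i + \epsilon_{ijk} n_j \left[ \int_S ds'\,\partial_k G^R(\mathbf{x}, \mathbf{x}')\,(\hat{n}' \times \nabla' f(\mathbf{x}')) \right]_i. \tag{63}$$

The last step shows the result of integrating by parts. Letting

$$\psi = f(\mathbf{x}')\,\partial_k G^R(\mathbf{x}, \mathbf{x}'), \tag{64}$$

we apply an adjunct to Stokes's theorem,

$$\int_S ds\,(\hat{n} \times \nabla \psi) = \oint_{\partial S} d\mathbf{l}\,\psi, \tag{65}$$

to the part of the first term inside the brackets, to get

$$-\epsilon_{ijk} n_j \left[ \int_S ds'\,\hat{n}' \times \nabla' \big( f(\mathbf{x}')\,\partial_k G^R(\mathbf{x}, \mathbf{x}') \big) \right]_i = -\epsilon_{ijk} n_j \left[ \oint_{\partial S} d\mathbf{l}'\,f(\mathbf{x}')\,\partial_k G^R(\mathbf{x}, \mathbf{x}') \right]_i$$

$$= -\epsilon_{ijk} n_j \oint_{\partial S} dl'_i\,f(\mathbf{x}')\,\partial_k G^R(\mathbf{x}, \mathbf{x}') \tag{66}$$

$$= -\oint_{\partial S} d\mathbf{l}' \cdot (\hat{n} \times \nabla G^R(\mathbf{x}, \mathbf{x}'))\,f(\mathbf{x}'), \tag{67}$$

which is integrable. To evaluate the rest, use the fact that

$$\nabla G^R(\mathbf{x}, \mathbf{x}') = -\nabla' G^R(\mathbf{x}, \mathbf{x}') \tag{68}$$

to write

$$\epsilon_{ijk} n_j \left[ \int_S ds'\,\partial_k G^R(\mathbf{x}, \mathbf{x}')\,(\hat{n}' \times \nabla' f(\mathbf{x}')) \right]_i = -\int_S ds'\,\partial'_k G^R(\mathbf{x}, \mathbf{x}')\,\epsilon_{kij}\,(\hat{n}' \times \nabla' f(\mathbf{x}'))_i\,n_j \tag{69}$$

$$= -\int_S ds'\,\nabla' G^R(\mathbf{x}, \mathbf{x}') \cdot \left[ (\hat{n}' \times \nabla' f(\mathbf{x}')) \times \hat{n} \right]. \tag{70}$$

At the field point, the vector in brackets becomes

$$(\hat{n} \times \nabla' f(\mathbf{x}')) \times \hat{n} = -\hat{n} \times (\hat{n} \times \nabla' f(\mathbf{x}')) = \nabla' f(\mathbf{x}'). \tag{71}$$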

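Some notation from differential geometry is useful at this point: $\partial_\mu \mathbf{x} = \partial \mathbf{x} / \partial u^\mu$ is the
derivative of the surface with respect to surface parameter $u^\mu$; $g_{\mu\nu}$ is the metric tensor given
by $\partial_\mu \mathbf{x} \cdot \partial_\nu \mathbf{x}$; $g^{\mu\nu}$ is the inverse of $g_{\mu\nu}$; $g$ is the determinant of $g_{\mu\nu}$; and $\partial'_\mu f$ represents the
derivative of $f$ with respect to $u'^\mu$, i.e., $\partial'_\mu f = \partial f(\mathbf{x}'(\mathbf{u}')) / \partial u'^\mu$.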

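Thus, in the language of differential geometry, the vector in brackets becomes

$$\partial'^{\mu} f\,\partial'_\mu \mathbf{x}' = g^{\mu\nu}\,\partial'_\nu f\,\partial'_\mu \mathbf{x}' = \frac{a^\mu\,\partial'_\mu \mathbf{x}'}{\sqrt{g(\mathbf{u})}},$$

when $a^\mu$ is defined as

$$a^\mu = \sqrt{g(\mathbf{u})}\,g^{\mu\nu}\,\partial'_\nu f \tag{72}$$

evaluated at the field point. Therefore, we may write

$$-\int_S ds'\,\nabla' G^R(\mathbf{x}, \mathbf{x}') \cdot \left[ (\hat{n}' \times \nabla' f(\mathbf{x}')) \times \hat{n} \right]$$

$$= \int_S ds'\,\nabla' G^R(\mathbf{x}, \mathbf{x}') \cdot \left[ \hat{n} \times (\hat{n}' \times \nabla' f(\mathbf{x}')) + \frac{a^\mu\,\partial'_\mu \mathbf{x}'}{\sqrt{g(\mathbf{u}')}} \right] \tag{73}$$

$$\quad - \int_S ds'\,\nabla' G^R(\mathbf{x}, \mathbf{x}') \cdot \left[ \frac{a^\mu\,\partial'_\mu \mathbf{x}'}{\sqrt{g(\mathbf{u}')}} \right]. \tag{74}$$

The first term is integrable because the zero of

$$\left[ \hat{n} \times (\hat{n}' \times \nabla' f(\mathbf{x}')) + \frac{a^\mu\,\partial'_\mu \mathbf{x}'}{\sqrt{g(\mathbf{u}')}} \right] \tag{75}$$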
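at the field point cancels one of the two poles from $\nabla' G^R(\mathbf{x}, \mathbf{x}')$ at the field point. The other
term may be rewritten as

$$-\int_S ds'\,\nabla' G^R(\mathbf{x}, \mathbf{x}') \cdot \left[ \frac{a^\mu\,\partial'_\mu \mathbf{x}'}{\sqrt{g(\mathbf{u}')}} \right] \tag{76}$$

$$= -\int_S ds'\,\nabla'_\parallel \cdot \left( G^R(\mathbf{x}, \mathbf{x}')\,\frac{a^\mu\,\partial'_\mu \mathbf{x}'}{\sqrt{g(\mathbf{u}')}} \right) + a^\mu \int_S ds'\,G^R(\mathbf{x}, \mathbf{x}')\,\nabla'_\parallel \cdot \left[ \frac{\partial'_\mu \mathbf{x}'}{\sqrt{g(\mathbf{u}')}} \right], \tag{77}$$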


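where the last step shows the result of integrating by parts. The part of the first term in
parentheses has no normal component, so it can be converted to a boundary integral using
the divergence theorem for open surfaces (see Appendix B):

$$-\int_S ds'\,\nabla'_\parallel \cdot \left( G^R(\mathbf{x}, \mathbf{x}')\,\frac{a^\mu\,\partial'_\mu \mathbf{x}'}{\sqrt{g(\mathbf{u}')}} \right) \tag{78}$$

$$= -\oint_{\partial S} (d\mathbf{l}' \times \hat{n}') \cdot \left( G^R(\mathbf{x}, \mathbf{x}')\,\frac{a^\mu\,\partial'_\mu \mathbf{x}'}{\sqrt{g(\mathbf{u}')}} \right). \tag{79}$$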


The second term is zero since (see Appendix C)

$$\nabla'_\parallel \cdot \left[ \frac{\partial'_\mu \mathbf{x}'}{\sqrt{g(\mathbf{u}')}} \right] = 0. \tag{80}$$


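Putting the various terms together, we arrive at the numerically tractable expression for
the integral needed to compute local corrections for the hypersingular component of the
kernel,

$$\int_S ds' \left( k^2 (\hat{n} \cdot \hat{n}')\,G^R(\mathbf{x}, \mathbf{x}')\,f(\mathbf{x}') + \nabla' G^R(\mathbf{x}, \mathbf{x}') \cdot \left[ \hat{n} \times (\hat{n}' \times \nabla' f(\mathbf{x}')) + \frac{a^\mu\,\partial'_\mu \mathbf{x}'}{\sqrt{g(\mathbf{u}')}} \right] \right)$$

$$\quad - \oint_{\partial S} d\mathbf{l}' \cdot \left( (\hat{n} \times \nabla G^R(\mathbf{x}, \mathbf{x}'))\,f(\mathbf{x}') + (\hat{n}' \times \partial'_\mu \mathbf{x}')\,\frac{a^\mu\,G^R(\mathbf{x}, \mathbf{x}')}{\sqrt{g(\mathbf{u}')}} \right), \tag{81}$$

where

$$a^\mu = \sqrt{g(\mathbf{u})}\,g^{\mu\nu}\,\partial_\nu f(\mathbf{x}(\mathbf{u})), \tag{82}$$

evaluated at the field point. The first integral is a surface integral whose integrand diverges
no worse than $1/r$ near the field point; the second is a boundary integral of a regular function
(so long as the field point is never situated on the boundary).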
A.3. Three-dimensional vector.

A.3.a. $G(r)\,(\mathbf{t}(\mathbf{x}) \cdot \mathbf{t}'(\mathbf{x}'))$.

This  kernel  is  identical  to  G(r)  in  the  3D  scalar  case,  except  that  the  regular  function 
with  which  it  must  be  convolved  is  the  inner  product  of  a  tangent  vector  t(x)  at  the  field 
point  and  a  tangent  vector  t'(x')  at  the  source  point.  Four  sets  of  local  corrections  must  be 
computed  for  each  field  point  since  there  are  two  independent  tangent  vectors  at  each  field 
point  and  two  at  each  source  point. 

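A.3.b. $\mathbf{t}(\mathbf{x}) \cdot (\nabla' G(r) \times \mathbf{t}'(\mathbf{x}'))$.

$$\mathbf{t} \cdot (\nabla' G(r) \times \mathbf{t}'(\mathbf{x}')) = (ikr - 1)\,e^{ikr}\,\frac{(\mathbf{t}'(\mathbf{x}') \times \mathbf{t}(\mathbf{x})) \cdot \mathbf{r}}{r^2}\,\frac{1}{r} \tag{83}$$

$$= \underbrace{ik^3 \left( \frac{\sin(kr)/(kr) - \cos(kr)}{(kr)^2} \right)}_{\text{regular}} \underbrace{\big( (\mathbf{t}(\mathbf{x}) \times \mathbf{t}'(\mathbf{x}')) \cdot \mathbf{r} \big)}_{\text{regular}} + \underbrace{\big( \cos(kr) + (kr)\sin(kr) \big)}_{\text{regular}}\, \underbrace{\frac{(\mathbf{t}(\mathbf{x}) \times \mathbf{t}'(\mathbf{x}')) \cdot \mathbf{r}}{r^2}\,\frac{1}{r}}_{\text{singular}}. \tag{84}$$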

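The analysis of the singular component is as follows. We can write $\mathbf{t}(\mathbf{x})$ in terms of surface
derivatives at the field point,

$$\mathbf{t}(\mathbf{x}) = \zeta^\mu\,\partial_\mu \mathbf{x}, \tag{85}$$

with some pair of coefficients $\zeta^\mu$, $\mu = 1, 2$. Letting $\mathbf{u}'$ denote the parameterization of the
source point relative to the field point, we can write the expansions for $\mathbf{t}'(\mathbf{x}')$ and $\mathbf{r}(\mathbf{x}')$ about
the field point,

$$\mathbf{t}'(\mathbf{x}') = \xi^\rho\,\partial'_\rho \mathbf{x}' = \xi^\rho \left( \partial_\rho \mathbf{x} + \partial_\rho \partial_\sigma \mathbf{x}\,u'^\sigma + \cdots \right), \tag{86}$$

for some other pair of coefficients $\xi^\rho$ with $\rho = 1, 2$, and

$$\mathbf{r}(\mathbf{x}') = \partial_\tau \mathbf{x}\,u'^\tau + \cdots. \tag{87}$$

Then

$$\big( (\mathbf{t}(\mathbf{x}) \times \mathbf{t}'(\mathbf{x}')) \cdot \mathbf{r} \big) = \zeta^\mu \xi^\rho \left( \partial_\mu \mathbf{x} \times \partial_\rho \mathbf{x} + \partial_\mu \mathbf{x} \times \partial_\rho \partial_\sigma \mathbf{x}\,u'^\sigma + \cdots \right) \cdot \left( \partial_\tau \mathbf{x}\,u'^\tau + \cdots \right) \tag{88}$$

$$= \zeta^\mu \xi^\rho \big( (\partial_\mu \mathbf{x} \times \partial_\rho \partial_\sigma \mathbf{x}) \cdot \partial_\tau \mathbf{x} \big)\,u'^\sigma u'^\tau + \cdots. \tag{89}$$

Since the leading term in $r^2$ is also second order in $\mathbf{u}'$, the ratio $((\mathbf{t}(\mathbf{x}) \times \mathbf{t}'(\mathbf{x}')) \cdot \mathbf{r})/r^2$ does
not diverge in the limit as $r \to 0$. However, like the factors $(\hat{n}' \cdot \mathbf{r})/r^2$ and $(\hat{n} \cdot \mathbf{r})/r^2$ from the
3D scalar case, this ratio is not a regular function unless the principal radii of curvature at
the field point are identical. Computation of local correction integrals for each combination
of tangent vectors at the field and source points proceeds as in the corresponding 3D scalar
case.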
A.3.c. $(\mathbf{t}(\mathbf{x}) \cdot \nabla)(\nabla' G(r) \cdot \mathbf{t}'(\mathbf{x}'))$.

$$(\mathbf{t}(\mathbf{x}) \cdot \nabla)(\nabla' G(r) \cdot \mathbf{t}'(\mathbf{x}')) = (\mathbf{t} \cdot \mathbf{t}') \left( \frac{1 - ikr}{r^3} \right) e^{ikr} + (\mathbf{t} \cdot \mathbf{r})(\mathbf{t}' \cdot \mathbf{r}) \left( \frac{k^2 r^2 + 3ikr - 3}{r^5} \right) e^{ikr} \tag{90}$$

$$= \underbrace{ik^3 \left[ \frac{\sin(kr)/(kr) - \cos(kr)}{(kr)^2}\,(\mathbf{t} \cdot \mathbf{t}') + k^2\,\frac{((kr)^2 - 3)\sin(kr) + 3kr\cos(kr)}{(kr)^5}\,(\mathbf{t} \cdot \mathbf{r})(\mathbf{t}' \cdot \mathbf{r}) \right]}_{\text{regular}} + \underbrace{(\mathbf{t} \cdot \nabla)(\nabla' G^R(r) \cdot \mathbf{t}')}_{\text{hypersingular}}. \tag{91}$$

The result is very similar to that in the 3D scalar case. The real part of $G(r)$, namely
$G^R(r) = \cos(kr)/r$, produces a hypersingular term that is not (in general) integrable because
it diverges like $1/r^3$ relative to the field point. We now show how to manipulate it
into a form that can be evaluated numerically when the region of integration contains the
field point.

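Reformulating the integral of the hypersingular term begins with an integration by parts:

$$\int_S ds'\,(\mathbf{t}(\mathbf{x}) \cdot \nabla)(\nabla' G^R(\mathbf{x}, \mathbf{x}') \cdot \mathbf{t}'(\mathbf{x}')) = \int_S ds'\,\mathbf{t}'(\mathbf{x}') \cdot \nabla' (\mathbf{t}(\mathbf{x}) \cdot \nabla G^R(\mathbf{x}, \mathbf{x}')) \tag{92}$$

$$= \int_S ds'\,\nabla'_\parallel \cdot \left[ \mathbf{t}'(\mathbf{x}') (\mathbf{t}(\mathbf{x}) \cdot \nabla G^R(\mathbf{x}, \mathbf{x}')) \right] - \int_S ds'\,(\mathbf{t}(\mathbf{x}) \cdot \nabla G^R(\mathbf{x}, \mathbf{x}'))(\nabla'_\parallel \cdot \mathbf{t}'(\mathbf{x}')). \tag{93}$$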

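The first term on the last line can be converted to a boundary integral using the divergence
theorem for open surfaces (see Appendix B) and the fact that the argument of $\nabla'_\parallel \cdot$ is tangential
to the surface:

$$\int_S ds'\,\nabla'_\parallel \cdot \left[ \mathbf{t}'(\mathbf{x}') (\mathbf{t}(\mathbf{x}) \cdot \nabla G^R(\mathbf{x}, \mathbf{x}')) \right] = \oint_{\partial S} dl'\,(\hat{e}' \cdot \mathbf{t}'(\mathbf{x}'))(\mathbf{t}(\mathbf{x}) \cdot \nabla G^R(\mathbf{x}, \mathbf{x}')). \tag{94}$$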

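The second term is

$$-\int_S ds'\,(\mathbf{t}(\mathbf{x}) \cdot \nabla G^R(\mathbf{x}, \mathbf{x}'))(\nabla'_\parallel \cdot \mathbf{t}'(\mathbf{x}')) = \int_S ds'\,\nabla' G^R(\mathbf{x}, \mathbf{x}') \cdot \left[ \mathbf{t}(\mathbf{x}) (\nabla'_\parallel \cdot \mathbf{t}'(\mathbf{x}')) \right]. \tag{95}$$

Write this as

$$\int_S ds'\,\nabla' G^R(\mathbf{x}, \mathbf{x}') \cdot \left[ \mathbf{t}(\mathbf{x}) (\nabla'_\parallel \cdot \mathbf{t}'(\mathbf{x}')) - \frac{a^\mu\,\partial'_\mu \mathbf{x}'}{\sqrt{g(\mathbf{u}')}} \right] + \int_S ds'\,\nabla' G^R(\mathbf{x}, \mathbf{x}') \cdot \left[ \frac{a^\mu\,\partial'_\mu \mathbf{x}'}{\sqrt{g(\mathbf{u}')}} \right], \tag{96}$$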



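where the constant $a^\mu$ is chosen to make $\mathbf{t}(\mathbf{x}) (\nabla'_\parallel \cdot \mathbf{t}'(\mathbf{x}'))$ and $a^\mu\,\partial'_\mu \mathbf{x}' / \sqrt{g(\mathbf{u}')}$ equal at the
field point. In other words, $a^\mu$ is defined as

$$a^\mu = \sqrt{g(\mathbf{u})}\,g^{\mu\nu}\,(\mathbf{t}(\mathbf{x}) \cdot \partial'_\nu \mathbf{x}')(\nabla'_\parallel \cdot \mathbf{t}'(\mathbf{x}')) \tag{97}$$

evaluated at the field point. The first term is integrable because the zero of

$$\left[ \mathbf{t}(\mathbf{x}) (\nabla'_\parallel \cdot \mathbf{t}'(\mathbf{x}')) - \frac{a^\mu\,\partial'_\mu \mathbf{x}'}{\sqrt{g(\mathbf{u}')}} \right] \tag{98}$$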

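at the field point cancels one of the two poles from $\nabla' G^R(\mathbf{x}, \mathbf{x}')$ at the field point. As shown
in the 3D scalar case, the second term reduces to the boundary integral:

$$\int_S ds'\,\nabla' G^R(\mathbf{x}, \mathbf{x}') \cdot \left[ \frac{a^\mu\,\partial'_\mu \mathbf{x}'}{\sqrt{g(\mathbf{u}')}} \right] \tag{99}$$

$$= \oint_{\partial S} dl'\,\hat{e}' \cdot \left( G^R(\mathbf{x}, \mathbf{x}')\,\frac{a^\mu\,\partial'_\mu \mathbf{x}'}{\sqrt{g(\mathbf{u}')}} \right). \tag{100}$$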

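Putting the various terms together, we arrive at the numerically tractable expression for
the integral needed to compute local corrections for the hypersingular component of the
kernel,

$$\int_S ds'\,\nabla' G^R(\mathbf{x}, \mathbf{x}') \cdot \left[ \mathbf{t}(\mathbf{x}) (\nabla'_\parallel \cdot \mathbf{t}'(\mathbf{x}')) - \frac{a^\mu\,\partial'_\mu \mathbf{x}'}{\sqrt{g(\mathbf{u}')}} \right]$$

$$\quad + \oint_{\partial S} dl'\,\hat{e}' \cdot \left( (\mathbf{t}(\mathbf{x}) \cdot \nabla G^R(\mathbf{x}, \mathbf{x}'))\,\mathbf{t}'(\mathbf{x}') + G^R(\mathbf{x}, \mathbf{x}')\,\frac{a^\mu\,\partial'_\mu \mathbf{x}'}{\sqrt{g(\mathbf{u}')}} \right), \tag{101}$$

where

$$a^\mu = \sqrt{g(\mathbf{u})}\,g^{\mu\nu}\,(\mathbf{t}(\mathbf{x}) \cdot \partial'_\nu \mathbf{x}')(\nabla'_\parallel \cdot \mathbf{t}'(\mathbf{x}')) = \sqrt{g(\mathbf{u})}\,g^{\mu\nu}\,(\mathbf{t}(\mathbf{x}) \cdot \partial'_\nu \mathbf{x}')(g^{\rho\sigma}\,\partial'_\rho \mathbf{t}' \cdot \partial'_\sigma \mathbf{x}'), \tag{102}$$

evaluated at the field point. The first integral is a surface integral whose integrand diverges
no worse than $1/r$; the second is a boundary integral of a regular function (so long as the
field point is never situated on the boundary).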

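If, as suggested in Section III.C.3, the $\mu$th tangent vector at the field point (with surface
parameter $\mathbf{u}_0$) is given by

$$\mathbf{t}_\mu(\mathbf{u}) = \partial_\mu \mathbf{x}(\mathbf{u}) \tag{103}$$

and the $\nu$th vector testing function associated with scalar testing function $f^{(k)}(\mathbf{u})$ is given
by

$$\mathbf{t}'_\nu(\mathbf{u}) = \frac{\partial_\nu \mathbf{x}(\mathbf{u})}{\sqrt{g(\mathbf{u})}}\,f^{(k)}(\mathbf{u}), \tag{104}$$

then Eq. (101) simplifies to

$$\int_S ds'\,\nabla' G^R(\mathbf{x}, \mathbf{x}') \cdot \left( \partial_\mu \mathbf{x}\,\partial'_\nu f^{(k)}(\mathbf{u}') - \partial'_\mu \mathbf{x}'\,\partial_\nu f^{(k)}(\mathbf{u}_0) \right) \Big/ \sqrt{g(\mathbf{u}')}$$

$$\quad + \oint_{\partial S} dl'\,\hat{e}' \cdot \left( G^R(\mathbf{x}, \mathbf{x}')\,\partial_\nu f^{(k)}(\mathbf{u}_0)\,\partial'_\mu \mathbf{x}' + (\partial_\mu \mathbf{x} \cdot \nabla G^R(\mathbf{x}, \mathbf{x}'))\,f^{(k)}(\mathbf{u}')\,\partial'_\nu \mathbf{x}' \right) \Big/ \sqrt{g(\mathbf{u}')}. \tag{105}$$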

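B. Divergence Theorem for Open Surfaces

Substitute

$$\mathbf{B} = \hat{n} \times \mathbf{A} \tag{106}$$

into Stokes's theorem

$$\int_S ds\,\hat{n} \cdot (\nabla \times \mathbf{B}) = \oint_{\partial S} d\mathbf{l} \cdot \mathbf{B} \tag{107}$$

to get

$$\int_S ds\,\hat{n} \cdot (\nabla \times (\hat{n} \times \mathbf{A})) = \int_S ds\,\hat{n} \cdot \left[ \hat{n} (\nabla \cdot \mathbf{A}) - (\hat{n} \cdot \nabla) \mathbf{A} - \mathbf{A} (\nabla \cdot \hat{n}) + (\mathbf{A} \cdot \nabla) \hat{n} \right] \tag{108}$$

$$= \int_S ds \left[ (\nabla_\parallel \cdot \mathbf{A}) - (\hat{n} \cdot \mathbf{A})(\nabla \cdot \hat{n}) \right] \tag{109}$$

$$= \oint_{\partial S} d\mathbf{l} \cdot (\hat{n} \times \mathbf{A}) \tag{110}$$

$$= \oint_{\partial S} (d\mathbf{l} \times \hat{n}) \cdot \mathbf{A} \tag{111}$$

$$= \oint_{\partial S} dl\,\hat{e} \cdot \mathbf{A}, \tag{112}$$

where we have used the definition of tangential gradient

$$\nabla_\parallel = \nabla - \hat{n} (\hat{n} \cdot \nabla) \tag{113}$$

and the following equation, which relates the vector line element $d\mathbf{l}$ and the surface normal
$\hat{n}$ to the scalar line element $dl$ and the unit edge vector $\hat{e}$,

$$d\mathbf{l} \times \hat{n} = dl\,\hat{e}, \tag{114}$$

and the observation that

$$\hat{n} \cdot [(\mathbf{A} \cdot \nabla) \hat{n}] = [(\mathbf{A} \cdot \nabla) \hat{n}] \cdot \hat{n} = \tfrac{1}{2} (\mathbf{A} \cdot \nabla)(\hat{n} \cdot \hat{n}) = 0. \tag{115}$$

In other words, the divergence theorem for open surfaces is

$$\int_S ds \left[ (\nabla_\parallel \cdot \mathbf{A}) - (\hat{n} \cdot \mathbf{A})(\nabla \cdot \hat{n}) \right] = \oint_{\partial S} dl\,\hat{e} \cdot \mathbf{A} = \oint_{\partial S} (d\mathbf{l} \times \hat{n}) \cdot \mathbf{A}, \tag{116}$$

which simplifies to

$$\int_S ds\,(\nabla_\parallel \cdot \mathbf{A}) = \oint_{\partial S} dl\,\hat{e} \cdot \mathbf{A} = \oint_{\partial S} (d\mathbf{l} \times \hat{n}) \cdot \mathbf{A} \tag{117}$$

when $\mathbf{A}$ is everywhere tangential to $S$.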


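C. Proof that $\nabla'_\parallel \cdot [\partial'_\mu \mathbf{x}' / \sqrt{g(\mathbf{u}')}] = 0$

Summation over repeated indices is implied:

$$\nabla'_\parallel \cdot \left[ \frac{\partial'_\mu \mathbf{x}'}{\sqrt{g(\mathbf{u}')}} \right] = g^{\nu\lambda}\,\partial'_\lambda \mathbf{x}' \cdot \partial'_\nu \left( \frac{\partial'_\mu \mathbf{x}'}{\sqrt{g(\mathbf{u}')}} \right)$$

$$= g^{\nu\lambda} \left( \frac{\partial'_\nu \partial'_\mu \mathbf{x}'}{\sqrt{g(\mathbf{u}')}} - \frac{\partial'_\mu \mathbf{x}'}{2 \sqrt{g(\mathbf{u}')}^{\,3}}\,\partial'_\nu g(\mathbf{u}') \right) \cdot \partial'_\lambda \mathbf{x}'$$

$$= \frac{1}{\sqrt{g(\mathbf{u}')}} \left( g^{\nu\lambda}\,\partial'_\lambda \mathbf{x}' \cdot \partial'_\nu \partial'_\mu \mathbf{x}' - \frac{1}{2 g(\mathbf{u}')}\,\partial'_\mu g(\mathbf{u}') \right)$$

$$= \frac{1}{\sqrt{g(\mathbf{u}')}} \left( g^{\nu\lambda}\,\partial'_\lambda \mathbf{x}' \cdot \partial'_\nu \partial'_\mu \mathbf{x}' - \frac{1}{2}\,g^{\nu\lambda}\,\partial'_\mu (\partial'_\nu \mathbf{x}' \cdot \partial'_\lambda \mathbf{x}') \right)$$

$$= \frac{1}{\sqrt{g(\mathbf{u}')}} \left( g^{\nu\lambda}\,\partial'_\lambda \mathbf{x}' \cdot \partial'_\nu \partial'_\mu \mathbf{x}' - g^{\nu\lambda}\,\partial'_\lambda \mathbf{x}' \cdot \partial'_\mu \partial'_\nu \mathbf{x}' \right) = 0.$$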
A Prescription for the
Multilevel Helmholtz FMM


Mark  F.  Gyure  and  Mark  A.  Stalzer 
HRL  Laboratories 


The  authors  describe  a  multilevel  Helmholtz  FMM  as  a  way  to  compute  the  field 
caused  by  a  collection  of  source  points  at  an  arbitrary  set  of  field  points.  Their 
description  focuses  on  the  algorithm’s  mathematical  basics,  so  that  it  can  be 
applied  to  a  variety  of  applications. 


The fast multipole method for the scalar
Helmholtz equation, $(\nabla^2 + k^2)\psi = 0$, is commonly
used to compute acoustic- and electromagnetic-scattering
cross sections.1,2
Ronald Coifman, Vladimir Rokhlin, and Stephen
Wandzura3 described a single-level scheme, which has
been implemented in two and three dimensions for scalar
and vector scattering problems.4-6 The method has been
subsequently extended to multiple levels, again with an
emphasis on electromagnetic scattering.7,8

In this article, we'll focus on the basic multilevel FMM
algorithm as a way to quickly compute the field caused
by a collection of Helmholtz source points at an arbitrary
field point. To keep our description of the implementation
simple, we'll assume that the field is desired
at each source point, as would normally be the case when
constructing an impedance matrix for a physical problem.
Through this basic, but detailed, description, we
hope to make the multilevel Helmholtz FMM more accessible
for a variety of problems.
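
For reference, the computation that the FMM accelerates can be written as a direct double loop. The sketch below is a baseline under stated assumptions: it takes the point-source kernel to be $e^{ikr}/r$ (an assumption about normalization), evaluates the field at every source point while skipping the self term, and uses the illustrative name `direct_fields`. Its cost grows as $N^2$.

```python
import cmath
import math

def direct_fields(points, strengths, k):
    """Field at each source point due to all the other sources, O(N^2).

    Assumed kernel: exp(i k r) / r, with r the distance between two points.
    The self-interaction (r = 0) is skipped.
    """
    n = len(points)
    fields = [0j] * n
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            r = math.dist(points[i], points[j])
            fields[i] += strengths[j] * cmath.exp(1j * k * r) / r
    return fields

points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
strengths = [1.0, 2.0, -0.5]
for point, value in zip(points, direct_fields(points, strengths, k=math.pi)):
    print(point, value)
```

Any fast scheme can be validated against this loop on a small problem before it is trusted on a large one.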

The  mathematical  preliminaries 

Previous  research  on  the  FMM  has  taken  two  ap¬ 
proaches.  The  first3  starts  from  the  standard  integral 
equation  for  a  field  arising  from  an  arbitrary  source  dis¬ 
tribution  assumed  to  be  localized  to  surfaces: 


.  idr-ri 


(i) 


This approach then manipulates the integral equation
by substituting two identities: one, a form of the Gegenbauer
addition theorem, and the other, a plane wave expansion
for spherical Bessel functions. The result is an
expression for Equation 1, from which it is straightforward
to construct an algorithm for computing the field in
$O(N^{3/2})$ operations, where $N$ is the number of unknowns
describing the entire source distribution. Extension to a
multilevel FMM that scales as $O(N \log^2 N)$ is also possible
through this approach.
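
The addition theorem mentioned above can be checked numerically in its standard scalar form, $e^{ikR}/R = ik \sum_l (2l+1)\, j_l(kd)\, h_l^{(1)}(kr)\, P_l(\cos\gamma)$ for $d < r$, where $R$ is the distance between a source at distance $d$ and a field point at distance $r$ from the origin and $\gamma$ is the angle between them. The sketch below is an illustration with hand-rolled special functions (names introduced here), not the article's implementation; the series for $j_l$ is only suitable for moderate arguments.

```python
import cmath
import math

def spherical_jn(l, x, terms=40):
    """Spherical Bessel j_l(x) from its ascending series (moderate x only)."""
    term = 1.0
    for m in range(1, l + 1):
        term *= x / (2 * m + 1)                    # x^l / (2l+1)!!
    total = term
    for m in range(1, terms):
        term *= -(x * x / 2.0) / (m * (2 * l + 2 * m + 1))
        total += term
    return total

def spherical_yn_list(lmax, x):
    """y_0 .. y_lmax by upward recurrence (stable for y_l); assumes lmax >= 1."""
    ys = [-math.cos(x) / x, -math.cos(x) / x**2 - math.sin(x) / x]
    for l in range(1, lmax):
        ys.append((2 * l + 1) / x * ys[l] - ys[l - 1])
    return ys[: lmax + 1]

def legendre_list(lmax, c):
    """P_0(c) .. P_lmax(c) by the three-term recurrence; assumes lmax >= 1."""
    ps = [1.0, c]
    for l in range(1, lmax):
        ps.append(((2 * l + 1) * c * ps[l] - l * ps[l - 1]) / (l + 1))
    return ps[: lmax + 1]

def green_by_addition_theorem(k, field, source, lmax=30):
    """exp(ikR)/R expanded about the origin; requires |source| < |field|."""
    r = math.sqrt(sum(c * c for c in field))
    d = math.sqrt(sum(c * c for c in source))
    cos_gamma = sum(a * b for a, b in zip(field, source)) / (r * d)
    ys = spherical_yn_list(lmax, k * r)
    ps = legendre_list(lmax, cos_gamma)
    total = 0j
    for l in range(lmax + 1):
        h_l = complex(spherical_jn(l, k * r), ys[l])   # h_l = j_l + i y_l
        total += (2 * l + 1) * spherical_jn(l, k * d) * h_l * ps[l]
    return 1j * k * total

k = 2.0
field_point = (3.0, 0.5, -1.0)
source_point = (0.2, -0.3, 0.4)
distance = math.dist(field_point, source_point)
direct = cmath.exp(1j * k * distance) / distance
series = green_by_addition_theorem(k, field_point, source_point)
print(direct, series)
```

The terms decay roughly like $(d/r)^l$, which is why the expansion is only useful when source and field groups are well separated.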

The other approach, taken by Rokhlin in the original
Helmholtz FMM paper2 and the one we use here, uses
the language of multipole expansions that are valid exterior
or interior to groups containing an arbitrary number
of source or field points. In this approach, the essential
point is that diagonal transforms exist for translating
the origins of both interior and exterior expansions of
charge distributions as well as for converting exterior
expansions to interior expansions.

Rokhlin has already described the mathematical details
involved in constructing exterior and interior expansions
and has provided proofs of the various theorems involving
translation operators.2 We'll now provide a conceptual
framework in which manipulation of off-centered
expansions through diagonal translation
operators and efficient transforms is a completely
natural way to view the FMM. This
approach is not only more general, but also better
suited for describing the multilevel FMM. In
particular, it also makes clear why the interpolation
and filtering steps that are necessary in the
multilevel FMM must be treated carefully.

July-September 1998    1070-9924/98/$10.00 © 1998 IEEE

Multipole  expansions  and  translation 
operators 

Consider two well-separated spheres of radius
$R_1$ and $R_2$, each containing a collection of points.
We'll take the points inside $R_1$ to be Helmholtz
point sources and the points inside $R_2$ to be field
points at which we would like to evaluate the
field caused by the collection of sources in $R_1$.
This field, written as a multipole expansion2
valid outside $R_1$, is

v{r  )  =  X/U/(^WM,  (2) 

bn 

where $r$, $\theta$, and $\phi$ are relative to a coordinate system
centered in $R_1$, $h_l(kr)$ are spherical Hankel
functions of the first kind, and $Y_{lm}(\theta, \phi)$ are the
normalized spherical harmonics. We'll refer to
this expansion as an exterior or h-expansion.
Similarly, we can write an expression for the
field valid inside $R_2$:

<P{r)  =  X ahJi{kr)YUe’,P) ,  (3) 

bn 

where $r$, $\theta$, and $\phi$ are now relative to a coordinate
system centered in $R_2$, and $j_l(kr)$ are spherical
Bessel functions. We'll refer to this expansion
as an interior or j-expansion. For the
moment, we will consider both of these to be
infinite sums. The FMM then rests on three
observations:

•  The origin of the h-expansion (Equation 2)
can be shifted arbitrarily inside $R_1$, and a
new set of coefficients, $\beta_{lm}$, can be computed
for this new expansion. The same
holds for shifting a j-expansion (see Equation 3)
arbitrarily to a new origin inside $R_2$,
which results in a new set of coefficients.

•  An h-expansion valid outside $R_1$ can be
translated and converted into a j-expansion
valid inside $R_2$, resulting in a new set of
coefficients for the j-expansion, $\gamma_{lm}$.

•  Most crucial, these translations can be done
efficiently by transforming the coefficients
into a basis in which both translation operators
are diagonal. We'll illustrate this below
by constructing a diagonal form for the
h-expansion translation operator. The
FMM, with one or multiple levels, is now
basically a sequence of combinations and
translations of multipole coefficients resulting
in an expansion for the field that can
be easily evaluated at any point inside another
group.

Generalized addition theorems for partial wave expansions and their corresponding expressions for the translation of multipole coefficients have been known for many years.9,10
Rokhlin,  however,  was  the  first  to  realize  that 
these  translation  operators  could  accelerate  the 
numerical  computation  of  fields  obeying  the 
Helmholtz  equation.  A  general  expression  ex¬ 
ists  for  translating  the  coefficients  of  multipole 
expansions  that  are  solutions  to  the  Helmholtz 
equation;  the  specific  forms  of  interest  here  are 

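$\beta'_{lm} = \sum_{l'm'} \beta_{l'm'} \sum_{pq} c(lm|l'm'|pq)\,\lambda_{pq}$,   (4)

$\alpha'_{lm} = \sum_{l'm'} \alpha_{l'm'} \sum_{pq} c(lm|l'm'|pq)\,\lambda_{pq}$, and   (5)

$\gamma_{lm} = \sum_{l'm'} \beta_{l'm'} \sum_{pq} c(lm|l'm'|pq)\,\mu_{pq}$,   (6)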

where $c(lm|l'm'|pq)$ is proportional to the well-known 3j symbols involving products of three spherical harmonics:

$c(lm|l'm'|pq) = \int d\hat{k}\; Y^*_{lm}(k_\theta,k_\phi)\, Y_{l'm'}(k_\theta,k_\phi)\, Y_{pq}(k_\theta,k_\phi)$.   (7)

Following Rokhlin, we will refer to the functions $\lambda_{pq}$ and $\mu_{pq}$ as translation operators. They have the forms

$\lambda_{pq} = 4\pi\, j_p(k x_{12})\, Y^*_{pq}(\theta_{12},\phi_{12})$ and   (8)

$\mu_{pq} = 4\pi\, h_p(k x_{12})\, Y^*_{pq}(\theta_{12},\phi_{12})$.   (9)

In the above expressions, $x_{12}$, $\theta_{12}$, and $\phi_{12}$ refer to the coordinates associated with the vector pointing from the expansion's original center to the new center.

The problem with using the above expressions directly in a computational scheme is that an individual coefficient such as $\beta'_{lm}$ depends on a sum over all the original coefficients $\beta_{l'm'}$ and on a sum over a set of indices associated with the translation operator and 3j symbols. Even with truncation of the multipole expansion to a finite number of terms, $L$, this approach is not practical. A computationally viable scheme (that is, one that scales no worse than $O(L^2)$) requires diagonalizing this transformation, meaning that each coefficient can be translated independently of all the others. The problem, then, is to find a representation in which this translation is diagonal. This representation is often called the far-field representation, and the transform that diagonalizes the two translation operators is the far-field transform.

IEEE Computational Science & Engineering

Following Rokhlin, we define the far-field transform and inverse transform of an arbitrary function $f$ as

$f(k_\theta,k_\phi) = \sum_{lm} i^{l}\, Y_{lm}(k_\theta,k_\phi)\, f_{lm}$ and   (10)

$f_{lm} = \int d\hat{k}\; i^{-l}\, Y^*_{lm}(k_\theta,k_\phi)\, f(k_\theta,k_\phi)$.   (11)

This is basically just a spherical harmonic transform that rotates a function from one basis to another in exact analogy to a Fourier transform.

Consider the specific case of translating an h-expansion to a new origin, which means transforming its set of coefficients. By taking the (inverse) far-field transform of $\alpha$ and $\lambda$ in Equation 5, the far-field transform completely diagonalizes the transformation of the $\alpha$s; that is,

$\alpha'_{lm} = \int d\hat{k}\; i^{-l}\, Y^*_{lm}(k_\theta,k_\phi)\, \lambda(k_\theta,k_\phi)\, \alpha(k_\theta,k_\phi)$   (12)

or, equivalently through a far-field transform of Equation 12,

$\alpha'(k_\theta,k_\phi) = \lambda(k_\theta,k_\phi)\, \alpha(k_\theta,k_\phi)$.   (13)

Even more useful computationally is that the inverse transform of $\lambda_{lm}$ simplifies to

$\lambda(k_\theta,k_\phi) = \sum_{lm} i^{l}\, Y_{lm}(k_\theta,k_\phi)\, 4\pi\, j_l(k x_{12})\, Y^*_{lm}(\theta_{12},\phi_{12}) = e^{i k x_{12}\cos\gamma}$,   (14)

where $\gamma$ is the angle between $(\theta_{12},\phi_{12})$ and $(k_\theta,k_\phi)$. Because $\lambda$ is also the translation operator for j-expansions, the same analysis applies to the translation of interior expansions.
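The closed form of the shift operator is the familiar partial-wave expansion of a plane wave. The following is a minimal numerical sketch, not the authors' code, that checks the identity $\sum_l i^l (2l+1) j_l(kx) P_l(\cos\gamma) = e^{ikx\cos\gamma}$ (the spherical-harmonic sum over $m$ collapses to $P_l$ by the addition theorem). The function names, the power-series evaluation of $j_l$, and the sign convention of the exponent are assumptions made for illustration.

```python
import cmath


def spherical_jn(l, x, terms=40):
    """Spherical Bessel j_l(x) from its power series (adequate for moderate x)."""
    term = x ** l
    for odd in range(1, 2 * l + 2, 2):      # divide by (2l+1)!!
        term /= odd
    total = term
    for k in range(1, terms):
        term *= -(x * x / 2.0) / (k * (2 * l + 2 * k + 1))
        total += term
    return total


def lambda_far_field(L, kx, cos_gamma):
    """Truncated sum_{l=0}^{L} i^l (2l+1) j_l(kx) P_l(cos_gamma)."""
    p_prev, p = 1.0, cos_gamma              # P_0 and P_1
    total = spherical_jn(0, kx) + 0j
    phase = 1 + 0j
    for l in range(1, L + 1):
        phase *= 1j                         # i^l
        total += phase * (2 * l + 1) * spherical_jn(l, kx) * p
        # Legendre recurrence: advance to P_{l+1}
        p_prev, p = p, ((2 * l + 1) * cos_gamma * p - l * p_prev) / (l + 1)
    return total


def plane_wave(kx, cos_gamma):
    """Closed form that the series converges to."""
    return cmath.exp(1j * kx * cos_gamma)
```

With enough terms (roughly $L > kx$ plus a margin) the series and the closed form agree to rounding error, which is why a shift costs only one multiplication per far-field direction.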

The translation operator $\lambda$ represents a "local" shift in the group center, retaining the exterior or interior expansion. The translation of an h-expansion into a j-expansion is through the translation operator $\mu$, which, in the far-field basis, has a similar form to $\lambda$,

$\mu(k_\theta,k_\phi) = \sum_{l} i^{l}\,(2l+1)\, h_l(k x_{12})\, P_l(\cos\gamma)$,   (15)

but with considerably different mathematical behavior. The translation operator $\mu$ is qualitatively different from $\lambda$ in that no simpler expression exists. In fact, the infinite sum diverges, and the mathematical consequences of this divergence require careful attention in a rigorous treatment of the FMM. But a numerical implementation that uses truncated multipole expansions needs only a finite number of terms to achieve a given accuracy in the translation.2 Hence, the divergence of the infinite sum has no practical consequences.
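To make the divergence concrete, the sketch below tabulates the magnitudes $(2l+1)\,|h_l(kx_{12})|$ of the terms in the h-to-j translation sum. It is an illustration under stated assumptions rather than the authors' implementation: the function name is hypothetical, and the spherical Hankel functions are generated with the standard upward three-term recurrence.

```python
import cmath


def hankel_term_magnitudes(L, kx):
    """Return [(2l+1)*|h_l(kx)| for l = 0..L], h_l = spherical Hankel, first kind."""
    h_prev = cmath.exp(1j * kx) / (1j * kx)                # h_0
    h = -(kx + 1j) * cmath.exp(1j * kx) / kx ** 2          # h_1
    mags = [abs(h_prev), 3 * abs(h)]
    for l in range(1, L):
        # h_{l+1} = (2l+1)/x * h_l - h_{l-1}
        h_prev, h = h, (2 * l + 1) / kx * h - h_prev
        mags.append((2 * l + 3) * abs(h))
    return mags[: L + 1]
```

For `kx = 10`, the magnitudes stay of order one while `l` is below `kx` and then grow without bound, so the truncation point has to sit above the bandwidth of the fields being translated but below the region where the terms explode.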

So  far,  our  description  of  multipole  expan¬ 
sions  and  translation  operators  has  not  covered 
two  significant  issues.  We  haven’t  discussed  any 
of  the  theorems  that  prove  that  the  multipole 
expansions  themselves  converge  to  a  specified 
accuracy  in  a  number  of  terms  approximately 
proportional  to  the  group  radius.  Also,  we 
haven’t  discussed  truncation  of  the  series  for  the 
h-to-j translation operator, $\mu$. These issues are
important  in  numerical  implementation  because 
the  algorithm’s  accuracy  depends  critically  on 
the  number  of  terms  kept  in  these  series.  How¬ 
ever,  Rokhlin  has  already  adequately  addressed 
these  issues.2 

The above expressions for translation operators, together with the far-field transform, are the basic tools used to construct a multilevel FMM algorithm. Clearly, the field caused by a collection of sources inside an arbitrary group $G_1$ can be evaluated at any point inside a second group $G_2$ by converting the exterior h-expansion, valid outside $G_1$, to an interior j-expansion valid inside $G_2$. We can translate the coefficients of the j-expansion to any point inside $G_2$. Also, we can calculate the field at that point caused by the sources in $G_1$ by computing $\alpha_{00}$, the leading term in the j-expansion. No other terms contribute, because the expansion is already centered at the field point where $r = 0$ and all the terms $j_l(0)$ are zero except $j_0(0)$, which is one. Thus, we can evaluate the field directly through the far-field transform as

$\psi(0) = \alpha_{00} = \frac{1}{\sqrt{4\pi}} \int d\hat{k}\; \alpha(k_\theta,k_\phi)$.   (16)

Interpolation and filtering

One  crucial  issue  remains  in  constructing  an 
efficient  multilevel  algorithm  that  scales  prop¬ 
erly.  The  multilevel  Helmholtz  FMM  works 
fundamentally  the  same  as  the  Laplace  FMM  in 
that  it  combines  expansions  valid  inside  the  orig¬ 
inal  groups  to  form  expansions  valid  inside  cor¬ 
respondingly  larger  groups  with  a  bigger  group 
radius. This recursive regrouping results in a tree-like structure that has groups of different sizes at different levels of the tree. The h- or j-expansions valid for groups at one level must be combined to form expansions valid for either larger or smaller groups at a different level. More specifically, h-expansions from neighboring groups are translated and combined into a
single  h-expansion  representing  a  larger  group 
when  going  up  the  tree,  and  j-expansions  in  a 
large  group  are  translated  to  smaller  groups  go¬ 
ing  down  the  tree.  Let’s  look  at  these  two  oper¬ 
ations  in  more  detail. 

When  combining  smaller  groups  into  a  larger 
group,  the  number  of  coefficients  in  the  h- 
expansions  representing  each  of  the  smaller 
groups  must  increase  to  preserve  the  accuracy  of 
the  source  expansion  after  the  coefficients  are 
translated  and  combined  at  the  new  (larger) 
group  center.  This  is  a  consequence  of  translat¬ 
ing  the  h-expansions  to  origins  that  are  further 
away  than  what  was  allowed  by  the  number  of 
terms  in  the  original  expansions.  In  terms  of 
multipole  coefficients,  this  operation  is  handled 
by  adding  higher-order  coefficients,  initially 
zero,  and  then  translating  the  expansion.  The 
translation  mixes  the  multipole  coefficients  so 
that  the  higher  modes  are  nonzero  after  the 
translation.  This  new  expansion  can  be  com¬ 
bined  with  others  being  shifted  to  the  same 
group  center  by  simply  adding  their  coefficients 
term by term. The problem with implementing this procedure is that the translation operator must be applied in the diagonal far-field representation, not the multipole coefficient representation, for the reasons we described in the previous section. In the far-field basis, the addition of higher-order multipole terms that are zero amounts to an interpolation of the function $\beta(k_\theta,k_\phi)$ onto a denser set of far-field directions $(k_\theta,k_\phi)$. This interpolation must not introduce
spurious  high-order  multipole  terms;  otherwise, 
the  algorithm’s  accuracy  is  quickly  compromised. 

A  similar  problem  exists  when  translating  the 
j-expansions  of  larger  groups  to  the  centers  of 
smaller  groups,  a  procedure  that  is  required 
when  going  down  the  tree.  Because  a  smaller 


number  of  multipole  terms  are  needed  to  rep¬ 
resent  the  field  inside  a  smaller  group,  the  num¬ 
ber  of  terms  in  the  multipole  expansion  can  be 
decreased with no loss of accuracy. In the far-field representation, this procedure amounts to filtering the function $\alpha(k_\theta,k_\phi)$ to a less dense set of far-field directions $(k_\theta,k_\phi)$. But, just as in the
interpolation  step  described  above,  the  filtering 
operation  must  remove  only  the  higher-order 
multipole  coefficients;  otherwise,  the  accuracy 
is  similarly  compromised. 

The  implementation  of  fast,  efficient  interpo¬ 
lation  or  filtering  operations  is  straightforward 
in  principle.  Because  the  translation  operators 
are  diagonal  in  the  far-field  basis,  all  FMM  im¬ 
plementations  keep  the  h-  and  j-expansions  ex¬ 
clusively  in  the  far-field  representation.  The  in¬ 
terpolation  and  filtering  steps,  however,  are 
rigorously  defined  only  in  a  multipole  coeffi¬ 
cient  basis. 

Consider interpolating an h-expansion given by a set of coefficients in the far-field representation $\beta(k_\theta,k_\phi)$. The multipole coefficients are given by this far-field transform:

$\beta_{lm} = \int d(\cos k_\theta)\, P_l^m(\cos k_\theta) \int dk_\phi\, e^{-imk_\phi}\, \beta(k_\theta,k_\phi) = \int d(\cos k_\theta)\, P_l^m(\cos k_\theta)\, \beta_m(k_\theta)$.   (17)

We have left out phase and normalization factors in Equation 17. Because filtering or interpolation always involves a transform-inverse pair, we consider these factors as being absorbed into the definition of the multipole coefficients. Assuming a uniform distribution of points in the $k_\phi$ direction on the unit sphere, a fast Fourier transform (FFT) can easily and efficiently compute $\beta_m(k_\theta)$.

Numerical quadrature handles the remaining part of the transform:

$\beta_{lm} = \sum_{n} w_n\, P_l^m(\cos k_{\theta_n})\, \beta_m(k_{\theta_n})$,   (18)

where $w_n$ and $k_{\theta_n}$ are sets of weights and abscissas for an appropriately defined quadrature rule. Interpolation onto the denser set of points is then handled by the inverse transform

$\beta(k'_\theta,k'_\phi) = \sum_{m=-L'}^{L'} e^{imk'_\phi} \sum_{l=0}^{L'} P_l^m(\cos k'_\theta)\, \beta_{lm}$,   (19)

where $L' > L$, the far-field directions $(k'_\theta,k'_\phi)$ are now a correspondingly denser set of points on the unit sphere, and all the $\beta_{lm}$ corresponding to $l > L$ are zero. The filtering step happens in exactly the same way, except that $L' < L$ and the coefficients corresponding to $l > L'$ are simply truncated. In both cases, an FFT can handle the sum on $m$ straightforwardly.
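The azimuthal half of this procedure is easy to demonstrate in isolation. The sketch below is a one-dimensional analogue only, under stated assumptions: it resamples a band-limited periodic function in the $k_\phi$ direction by computing its Fourier coefficients, padding with zeros (interpolation) or discarding high modes (filtering), and evaluating on the new grid. It uses a naive discrete Fourier transform for clarity where a production code would use an FFT, and the function names are hypothetical.

```python
import cmath
import math


def fourier_coefficients(samples):
    """Coefficients c_m, |m| <= M, from n = 2M + 1 equispaced samples on [0, 2*pi)."""
    n = len(samples)
    M = (n - 1) // 2
    return {
        m: sum(f * cmath.exp(-2j * math.pi * m * j / n) for j, f in enumerate(samples)) / n
        for m in range(-M, M + 1)
    }


def resample(samples, n_new):
    """Interpolate (n_new > n) or filter (n_new < n) onto n_new equispaced points."""
    coeffs = fourier_coefficients(samples)
    M_new = (n_new - 1) // 2
    out = []
    for j in range(n_new):
        phi = 2 * math.pi * j / n_new
        # Modes above the old bandwidth are implicitly zero; modes above the
        # new bandwidth are dropped.
        out.append(sum(c * cmath.exp(1j * m * phi)
                       for m, c in coeffs.items() if abs(m) <= M_new))
    return out


def example_function(phi):
    """Band-limited test function with modes |m| <= 3."""
    return 1.0 + 2.0 * math.cos(3 * phi) + math.sin(phi)
```

Because no modes are invented during interpolation and only the high modes are removed during filtering, the round trip from 7 points to 15 and back reproduces the original samples to rounding error.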

The filtering and interpolating steps will
quickly  create  a  serious  computational  bottle¬ 
neck  and  break  the  scaling  of  the  entire  algo¬ 
rithm  if  not  treated  properly.  Indeed,  the  pri¬ 
mary  obstacle  to  constructing  a  practical 
multilevel  FMM  has  been  proper  handling  of 
these  steps. 

For this problem, the best solution (desirable because it is exact) is to construct a fast associated Legendre transform.11,12 When combined with an FFT of the $k_\phi$ directions on the unit sphere, this approach results in an operation count that scales no worse than $O(L^2 \log L)$, where $L$ is the order of the spherical harmonic expansion at a given level. This method works in principle but suffers from a large crossover point compared to the "semifast" transform, which also uses the FFT in the $k_\phi$ direction but uses a slow transform in the $k_\theta$ direction, and which scales as $L^3$. Unfortunately, this crossover point is squarely in the region encountered by problems of large but practical size. Recent work has improved this crossover point somewhat,12 and we are using the improved algorithm for higher levels of the multilevel FMM, which we'll describe next.

Implementation 

Our  multilevel  FMM  implementation  consists 
of  two  main  routines:  setup  and  apply. 
Setup  produces  a  tree  or  hierarchy  of  groups 
that  partition  the  sources.  It  uses  this  tree  to  pre¬ 
compute  the  translation  operators  and  other 
quantities.  Using  information  computed  by 
setup, apply forms Z · I, the value of the field
at  every  source  caused  by  all  other  sources. 


Figure 1. Multilevel FMM grouping. The small box A interacts with the dark shaded region, using the level-0 translation operators. At the next higher level, the medium box B interacts with the medium shaded region, and so on for the large box C. In general, for a low-accuracy solution ($L_l \approx k_0 D_l$), a box interacts with 27 other boxes (in 3D) through translation operators. The eight small boxes closest to A are handled directly.

Setup

To construct the tree, setup performs the grouping on a cubic lattice where each box edge has the length $D/\sqrt{3}$ (see Figure 1). The group diameter $D$ is picked to minimize the overall operation count and typically ranges from 0.5 to 1.5 wavelengths. At the lowest level (level 0), the routine assigns each elementary source to the box with the closest center. With this base grouping, the grouping process moves on to subsequent levels. At each level $l$, the size of the boxes doubles, so each box contains up to eight active subboxes. However, because a surface is generally being discretized, the number of active subboxes is usually closer to four. This grouping process continues until all the sources fit in one box. The quantity $H$ is set to the number of levels or height of the tree, and the topmost level is $H - 1$. The set of groups at a given level is denoted groups(l).

The translation operators at each level will have the same number of terms $L_l$ and far-field directions $K_l$ because the box sizes are the same. The number of terms at each level is given by an empirical fit,3

$L_l = k_0 D_l + \frac{d}{1.6}\,\log(k_0 D_l + \pi)$,   (20)

where $d$ is the desired number of digits and $k_0$ is the wave number (and should not be confused with the far-field directions). If necessary, setup increases the number of terms at a level until that number is a product of small primes. This makes the discrete Fourier transforms in the interpolation and filter steps fast.
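The term-count rule can be sketched in a few lines. This is an illustration under stated assumptions, not the routine described in the article: the function names are hypothetical, the coefficient `scale=1.6` and the use of the natural logarithm are assumptions about the empirical fit, rounding up to an integer is an assumption, and "small primes" is taken to mean 2, 3, and 5.

```python
import math


def num_terms(k0, diameter, digits, scale=1.6):
    """Empirical number of multipole terms for a group of the given diameter.

    `scale` is an assumed constant in the fit; adjust it to match the
    formula actually in use.
    """
    x = k0 * diameter
    return math.ceil(x + digits / scale * math.log(x + math.pi))


def next_smooth(n, primes=(2, 3, 5)):
    """Smallest integer >= n whose only prime factors are in `primes`."""
    while True:
        m = n
        for p in primes:
            while m % p == 0:
                m //= p
        if m == 1:
            return n
        n += 1


def terms_per_level(k0, d0, digits, height):
    """FFT-friendly term counts for levels 0..height-1 (box size doubles per level)."""
    return [next_smooth(num_terms(k0, d0 * 2 ** l, digits)) for l in range(height)]
```

Rounding each count up to a smooth number never reduces accuracy, because it only adds terms; it costs a few extra far-field directions in exchange for fast transforms.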

For each group, setup constructs two lists: nearby and far. For the top group, the nearby list contains itself and the far list is empty. The routine then starts at level $l = H - 2$ and works down to $l = 0$. For each group m ∈ groups(l), it considers all groups m' that are subgroups of groups in the near list of the parent group of m. If m' is far from m (that is, $k_0 |X_{mm'}| > L_l$), setup places m' on far(m); otherwise, it places m' on near(m). ($X_{mm'} = X_m - X_{m'}$, where $X_m$ is the center of group m and $X_{m'}$ is the center of group m'.) Once setup has constructed the near and far lists for every group in the tree, it truncates the tree ($H$ is reduced) so that the topmost level has a reasonable number of far interactions to compute (in the full tree, groups in the topmost levels are near each other). Given the (truncated) tree with the near and far lists, setup can precompute all the translation operators and other needed quantities, and the routine is complete.

Figure 2. The interpolation-and-shifting step for moving up the tree, and the inverse shifting-and-filtering step for moving down the tree.
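The top-down list construction can be sketched as follows. This is a simplified illustration, not the authors' setup routine: boxes are identified by integer lattice coordinates, a parent is obtained by halving the coordinates, and the far test is an assumed stand-in (boxes that do not touch are far) for the separation criterion in the text. All names are hypothetical.

```python
def parent(box):
    """Parent box on the next coarser lattice."""
    return tuple(c // 2 for c in box)


def is_far(a, b):
    """Assumed criterion: boxes that are not adjacent (Chebyshev distance > 1) are far."""
    return max(abs(x - y) for x, y in zip(a, b)) > 1


def build_near_far(level0_boxes, height):
    """Return (near, far) as {level: {box: set_of_boxes}} for a tree of given height."""
    groups = [set(level0_boxes)]
    for _ in range(height - 1):
        groups.append({parent(b) for b in groups[-1]})
    top = height - 1
    assert len(groups[top]) == 1, "height must be large enough for a single top box"

    # Top group: near list contains itself, far list is empty.
    near = {top: {b: {b} for b in groups[top]}}
    far = {top: {b: set() for b in groups[top]}}

    for level in range(top - 1, -1, -1):
        children = {}
        for b in groups[level]:
            children.setdefault(parent(b), []).append(b)
        near[level], far[level] = {}, {}
        for m in groups[level]:
            near[level][m], far[level][m] = set(), set()
            # Candidates: subgroups of the groups near m's parent.
            for p in near[level + 1][parent(m)]:
                for c in children[p]:
                    (far if is_far(m, c) else near)[level][m].add(c)
    return near, far
```

Because candidates come only from the parent's near list, every pair of groups is classified as far at exactly one level, which is what keeps the number of translations per group bounded.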


Apply

To form Z · I (that is, to apply the multilevel FMM operator to the source vector), apply follows this procedure:

1. Local to far: It computes a representation of the field external to a group caused by the sources in the group. For m ∈ groups(0),

$s_{mk} = \sum_{a \in sources(m)} e^{-i\mathbf{k}\cdot(\mathbf{x}_a - \mathbf{X}_m)}\, I_a$,

where $a$ is a source in group m, $\mathbf{x}_a$ is its location, $I_a$ is its strength, and $\mathbf{k} = k_0\hat{k}$ is a far-field direction $(k_\theta,k_\phi)$. This shifts each source to its group's center, where its field is accumulated with that of all other sources in the group.

2. Level-0 translation: For m ∈ groups(0), it computes $g_{mk} = \sum_{m' \in far(m)} T_{mm'k}\, s_{m'k}$, where $T_{mm'k} = \mu(k_\theta,k_\phi)$ is a level-0 translation operator as given in Equation 15, with $x_{12} = |X_{mm'}|$.

3. Uptree and translation: Working from level $l = 1$ to $l = H - 1$, apply first computes the field at the center of each level-$l$ group caused by its subgroups, and then translates this field to faraway groups and accumulates the fields from subgroups. Specifically, for each subgroup m' of m, it computes $s'_{m'k}$ = interpolate($s_{m'k}$) and then shifts this field to the center of m, accumulating it into $s_{mk}$.

The interpolate step takes the external representation of the m' group ($s_{m'k}$) and converts it into a representation $s'_{m'k}$ valid for its parent group m, as we discussed in the previous section (see Equation 19). Apply then shifts the field $s'_{m'k}$ to the center of group m and sums that field with the contributions from the other subgroups, thereby forming an external representation of the field caused by all the sources in m. Figure 2 depicts this interpolation and shifting. The quantities $s_{mk}$ correspond to the far-field representation of the $\beta$s in the previous section. Once apply has performed all the interpolation and shift steps at the level, it translates the fields, $g_{mk} = \sum_{m' \in far(m)} T_{mm'k}\, s_{m'k}$ for m ∈ groups(l), using translation operators for level $l$. The quantities $g_{mk}$ are the far-field representation of the $\alpha$s.

4. Downtree: Working from level $l = H - 1$ to $l = 1$, apply shifts the field from each group at level $l$ to its subgroups and converts it to the subgroup representation. Specifically, for m ∈ groups(l) and m' a subgroup of m, it shifts the field $g_{mk}$ to the center of m', giving $g'_{mk}$, and then filters: $g_{m'k}$ = filter($g'_{mk}$) (see Figure 2).

5. Far to local: Each lowest-level group now has the field caused by all faraway groups. Apply computes the effect on each point in each group:

$B_a = \sum_{k} w_k\, g_{mk}\, e^{-i\mathbf{k}\cdot(\mathbf{X}_m - \mathbf{x}_a)}$

for m ∈ groups(0) and a ∈ sources(m), where $w_k$ is the quadrature weight for the sphere rule. This corresponds to the integral over the far-field directions in Equation 16. The routine forms the quadrature weights from the product of a Gauss-Legendre quadrature rule (with $L_0$ abscissas) in the $\theta$ (polar) angle and a trapezoidal rule (with $2L_0$ abscissas) in the $\phi$ (azimuthal) angle.

6. Direct: Apply directly computes the lowest-level interactions that are too close for FMM: $B_a = B_a + \sum_{a' \in sources(m')} G(\mathbf{x}_a, \mathbf{x}_{a'})\, I_{a'}$ for m ∈ groups(0) and m' ∈ near(m).

The result is B with the accuracy specified by the translation operators.
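Steps 1, 2, and 5 can be exercised end to end for a single source and a single field point. The sketch below is an illustration under stated assumptions, not the authors' C++ code: it assumes the kernel is $e^{ikr}/r$, applies the prefactor $ik/4\pi$ explicitly (the article absorbs normalization into its definitions), and generates the Gauss-Legendre rule by Newton iteration so that only the standard library is needed. All function names are hypothetical.

```python
import cmath
import math


def gauss_legendre(n):
    """Nodes and weights of the n-point Gauss-Legendre rule on [-1, 1]."""
    nodes, weights = [], []
    for i in range(1, n + 1):
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))     # initial guess
        for _ in range(50):                                # Newton on P_n(x) = 0
            p_prev, p = 1.0, x
            for l in range(2, n + 1):
                p_prev, p = p, ((2 * l - 1) * x * p - (l - 1) * p_prev) / l
            dp = n * (x * p - p_prev) / (x * x - 1.0)
            dx = p / dp
            x -= dx
            if abs(dx) < 1e-15:
                break
        nodes.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights


def translation_operator(L, kD, cos_gamma):
    """Truncated sum_{l=0}^{L} i^l (2l+1) h_l(kD) P_l(cos_gamma)."""
    h_prev = cmath.exp(1j * kD) / (1j * kD)                # h_0
    h = -(kD + 1j) * cmath.exp(1j * kD) / kD ** 2          # h_1
    p_prev, p = 1.0, cos_gamma
    total = h_prev
    if L >= 1:
        total += 3j * h * p
    phase = 1j
    for l in range(1, L):
        h_prev, h = h, (2 * l + 1) / kD * h - h_prev
        p_prev, p = p, ((2 * l + 1) * cos_gamma * p - l * p_prev) / (l + 1)
        phase *= 1j
        total += phase * (2 * l + 3) * h * p
    return total


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fmm_pair(k, src, fld, c_src, c_fld, L, n_theta):
    """Field at `fld` from a unit source at `src` via shift, translate, shift."""
    D = [f - s for f, s in zip(c_fld, c_src)]              # center separation
    D_len = math.sqrt(dot(D, D))
    D_hat = [c / D_len for c in D]
    d_src = [x - c for x, c in zip(src, c_src)]            # x_a - X_m'
    d_fld = [c - x for x, c in zip(fld, c_fld)]            # X_m - x_b
    mus, ws = gauss_legendre(n_theta)
    n_phi = 2 * n_theta
    total = 0j
    for mu, w in zip(mus, ws):
        sin_t = math.sqrt(1.0 - mu * mu)
        for j in range(n_phi):
            phi = 2 * math.pi * j / n_phi
            k_hat = (sin_t * math.cos(phi), sin_t * math.sin(phi), mu)
            s = cmath.exp(-1j * k * dot(k_hat, d_src))                   # step 1
            g = translation_operator(L, k * D_len, dot(k_hat, D_hat)) * s  # step 2
            total += w * (2 * math.pi / n_phi) * g * cmath.exp(-1j * k * dot(k_hat, d_fld))  # step 5
    return 1j * k / (4 * math.pi) * total


def direct(k, src, fld):
    """Direct evaluation of exp(ikr)/r for comparison."""
    r = math.sqrt(sum((f - s) ** 2 for f, s in zip(fld, src)))
    return cmath.exp(1j * k * r) / r
```

With groups one wavelength across and centers three wavelengths apart, a dozen terms and a 20 by 40 direction grid reproduce the direct kernel to many digits; pushing `L` well past `k` times the center separation degrades the result because of the divergence discussed earlier.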

Time complexity

Consider a uniform discretization of a simple convex shape, such as a sphere, having $N$ points. The number of groups at the lowest level is $M_0 \approx N/D_0^2$. Let $D_0$ be one so that the groups are roughly the size of a wavelength; then $M_0 = O(N)$. For a low-accuracy solution, the number of terms at level $l$ is $L_l \approx k_0 D_l = k_0 2^l$. The branching factor of the FMM tree for a surface is four, so the number of groups at a level is $M_l = 4^{-l} M_0$. The total number of levels, $H$, is then given by $M_0 = 4^{H-1}$, assuming a full tree. For $H \ge 2$, we have $H = 1 + \log_4 M_0$ and thus $H = O(\log N)$.

So, the times for the steps in apply are as follows:

• Local to far: $T_{lf} = N\, 2L_0^2 = 2k_0^2 N = O(N)$ because the number of far-field directions at a level is $K_l = 2L_l^2$. The time for far to local is the same.

• Translation: A group interacts with 27 faraway groups (see Figure 1), which gives

$T_t = \sum_{l=0}^{H-1} 27\, M_l\, 2L_l^2 = 54\, k_0^2 M_0 H = O(N \log N)$.   (21)

• Downtree: To filter a single group from level $l$ to $l - 1$ requires $L_l$ FFTs of length $2L_l$, $L_{l-1}$ FFTs of length $2L_{l-1}$, and $2L_{l-1}$ 1D FMMs of length $L_l$. Recalling that each parent group must filter down to four subgroups, summing over all the levels gives

$T_d = \sum_{l=1}^{H-1} 4M_l \left( c_\alpha L_l\, 2L_l \log 2L_l + c_\beta\, 2L_{l-1} L_l \log L_l + c_\alpha L_{l-1}\, 2L_{l-1} \log 2L_{l-1} \right)$   (22)

$= 8 M_0 L_0^2 \sum_{l=1}^{H-1} \left( c_\alpha (l + \log 2L_0) + (c_\beta/2 + c_\alpha/4)(l + \log L_0) \right)$   (23)

$T_d = O(N \log^2 N)$,   (24)

where $c_\alpha$ and $c_\beta$ are the proportionality constants for the FFT and 1D FMM. The effort in shifting is negligible. Uptree has the same order of complexity.

• Direct: Each lowest-level group has eight nearby groups where interactions must be handled directly (see Figure 1). So, each source has a fixed amount of work in the direct interaction that does not grow with problem size, giving a complexity of $O(N)$.

Therefore, the overall scaling for the multilevel FMM is $O(N \log^2 N)$. For higher-accuracy solutions, $L_0$ increases, but $L_l < 2^l L_0$ for $l > 0$, so the $O(N \log^2 N)$ scaling is an upper bound for any reasonable accuracy.
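The translation count is easy to tabulate directly. The sketch below is a small illustration of the argument under the same idealized assumptions as the text (a full tree with branching factor four and term counts that double with each level); the function name is hypothetical.

```python
def translation_cost(m0, l0, height):
    """Total translation work: sum over levels of 27 * M_l * 2 * L_l**2."""
    total = 0
    for level in range(height):
        groups = m0 // 4 ** level          # M_l = 4^{-l} M_0
        terms = 2 ** level * l0            # L_l = 2^l L_0
        total += 27 * groups * 2 * terms ** 2
    return total
```

Each level contributes the same amount, `54 * m0 * l0**2`, because the number of groups falls by four while the number of directions per group grows by four. The total is therefore proportional to the number of levels, which is the source of the logarithm in Equation 21.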

Memory 

The memory required scales as $O(N \log N)$. A variety of techniques can lower the prefactor. First, because of the grouping, at a given level only a few discrete distances and orientations require translation operators. It pays to keep a cache of translation operators indexed by level, group separation $|X_{mm'}|$, and the cosines that the group separation makes with two far-field directions, $(\hat{X}_{mm'} \cdot \hat{k}_1)$ and $(\hat{X}_{mm'} \cdot \hat{k}_2)$. Before setup computes a translation operator, it searches the cache to see if the operator has been previously computed. This results in a substantial compression of the operator, as we'll show in the next section.
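A cache of this kind amounts to memoization. The sketch below is a simplified illustration, not the authors' data structure: it keys operators by level and by the integer lattice offset between two boxes (an assumed simplification of the separation-and-cosines index described above), and the class and method names are hypothetical.

```python
class TranslationCache:
    """Memoize translation operators by (level, box offset)."""

    def __init__(self, compute):
        self._compute = compute      # callable(level, offset) -> operator
        self._store = {}
        self.hits = 0

    def get(self, level, offset):
        key = (level, offset)
        if key in self._store:
            self.hits += 1
        else:
            self._store[key] = self._compute(level, offset)
        return self._store[key]

    def __len__(self):
        return len(self._store)


def count_reuse(n):
    """Request an operator for every non-adjacent ordered pair on an n^3 lattice."""
    cache = TranslationCache(lambda level, offset: offset)   # placeholder operator
    boxes = [(i, j, k) for i in range(n) for j in range(n) for k in range(n)]
    for a in boxes:
        for b in boxes:
            offset = tuple(x - y for x, y in zip(a, b))
            if max(abs(c) for c in offset) > 1:
                cache.get(0, offset)
    return cache
```

On a 4 by 4 by 4 lattice there are 3,096 far pairs but only 316 distinct offsets, so most requests are served from the cache. Real surfaces populate the lattice sparsely, yet the same effect explains the reuse counts reported in the results.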

Each  level  has  only  eight  distinct  sets  of  shift 
coefficients,  which  can  be  precomputed  and 
stored.  However,  the  lowest  level,  where  individ¬ 
ual  sources  are  shifted  to  group  centers  and  back 
(Steps  1  and  5),  has  as  many  coefficients  as  there 
are  sources  times  the  number  of  far-field  direc¬ 
tions.  Precomputing  these  coefficients  is  unnec¬ 
essary  because  they  are  simple  exponentials.  In¬ 
stead,  the  coefficients  can  be  computed  as  needed, 
once  per  apply.  The  cost  of  doing  this  can  be 
amortized  over  several  simultaneous  operator  ap¬ 
plications.  This  corresponds  to  solving  for  multi¬ 
ple  right-hand  sides  using  a  blocked  iterative 
solver,  which  is  a  common  practice.  Similarly,  the 
kernel  evaluations  for  the  direct  interactions  (Step 
6)  can  be  computed  as  needed  to  save  memory. 

Results 

We implemented apply in C++ and ran it on an IBM RS6000/590 workstation. We used the highly optimized FFTW package for discrete Fourier transforms13 and 1D FMM routines for filtering.11,12 Table 1 shows the apply time per right-hand side and the memory requirements for spheres of increasing sizes and selected accuracies discretized by picking points randomly on the surface. Figure 3 plots the times with least-squares fits to the time complexity. For the two-digits case, the fit is

$T(N) = 1.36 \times 10^{-5}\, N \log^2 N$.   (25)

The point at which apply starts to perform faster than a dense-operator application is approximately 25,000 unknowns. This assumes a sustained floating-point rate of 100 Mflops and no penalty for using the out-of-core techniques required to handle extremely large matrices. Table 2 shows the times for each algorithm step for the 31,415-unknowns problem.

Table 1. Runtime and memory requirements for apply, for various problem sizes and accuracies.

Points     Area (λ²)      Time (seconds) for different accuracies        Memory use (bytes)
                          2 digits       3 digits       4 digits         for two-digit accuracy
314        1.26 × 10^2    8.30 × 10^-1   1.29 × 10^0    1.53 × 10^0      1.4 × 10^6
2,827      1.13 × 10^3    2.17 × 10^0    3.17 × 10^0    4.51 × 10^0      1.7 × 10^7
7,853      3.14 × 10^3    8.49 × 10^0    1.21 × 10^1    1.59 × 10^1      4.9 × 10^7
15,393     6.16 × 10^3    2.11 × 10^1    2.88 × 10^1    3.74 × 10^1      1.0 × 10^8
31,415     1.26 × 10^4    4.58 × 10^1    6.24 × 10^1    8.07 × 10^1      2.1 × 10^8
70,685     2.83 × 10^4    1.28 × 10^2    1.69 × 10^2    2.11 × 10^2      4.9 × 10^8
125,663    5.03 × 10^4    2.40 × 10^2    3.19 × 10^2    3.93 × 10^2      8.6 × 10^8
196,349    7.85 × 10^4    3.92 × 10^2    (none)         (none)           1.4 × 10^9

We measured the effect of the translation operator cache for the 196,349-unknowns problem at two-digit accuracy. On average, each level-0 translation operator is used 3,512 times; each level-1 operator is used 1,056 times; each
level-2  operator  is  used  290  times;  each  level-3 
operator  is  used  77  times;  and  each  level-4  op¬ 
erator  is  used  14  times.  The  lowest  levels  use 
each  operator  many  times  because  group  pairs 
have  many  opportunities  to  be  in  the  same  rela¬ 
tive  orientation  and  distance.  Higher  levels  have 
fewer  groups  and  hence  less  potential  for  reuse. 

Overall, the multilevel FMM memory requirements are dramatically smaller than those of a dense matrix. For the 196,349-unknowns problem at two-digit accuracy, the FMM requires approximately 1.4 Gbytes, compared with the 616 Gbytes for a dense matrix (assuming double precision). This represents a savings of more than a factor of 400.
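The headline numbers in this section can be cross-checked with a few lines of arithmetic. The sketch below is an illustration under stated assumptions: the logarithm in the least-squares fit is taken to be the natural logarithm (an inference from the tabulated times, not something stated in the text), and the dense matrix is assumed to hold complex double-precision entries of 16 bytes each. The function names are hypothetical.

```python
import math


def fitted_apply_time(n):
    """Two-digit least-squares fit, assuming a natural logarithm."""
    return 1.36e-5 * n * math.log(n) ** 2


def dense_matrix_bytes(n, bytes_per_entry=16):
    """Storage for a dense n-by-n matrix of complex double-precision entries."""
    return n * n * bytes_per_entry


def memory_savings(n, fmm_bytes):
    """Ratio of dense-matrix storage to FMM storage."""
    return dense_matrix_bytes(n) / fmm_bytes
```

Under these assumptions the fit gives about 46 seconds at 31,415 unknowns and about 397 seconds at 196,349 unknowns, close to the measured two-digit times, and the dense matrix for the largest problem needs about 617 billion bytes, roughly 440 times the FMM figure.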

The algorithm we've described can be used to compute acoustic scattering with Dirichlet boundary conditions using a point-based, or Nystrom, discretization.14 The only additions required are that the far-to-local step must incorporate the Nystrom quadrature weights and that the kernel values in the direct computation must be corrected by an appropriate scheme to accurately treat the kernel's singular nature. Many other important issues exist, such as the choice of integral-equation formulation, appropriate discretizations, and the iterative solver and preconditioner. But these are all independent of the FMM.

An extension to electromagnetic scattering or to a patch-based (Galerkin) discretization can be copied right from the single-level scheme because the multilevel translation-operator machinery is independent of boundary conditions and discretizations.

The authors describe a diagonal form for translating far-field expansions to use in low-frequency fast multipole methods. Their approach combines evanescent and propagating plane waves to reduce the computational cost of FMM implementation.

Many problems in acoustics, microwave filter design, interconnect modeling, and electromagnetic scattering require the solution of the Helmholtz equation (see Figure 1). To simplify the ensuing discussion, we limit our attention to the discrete N-body problem (see Figure 1, Equation 4). The numerical difficulty here is clear; direct calculation of the sums in Equation 4 at each point requires $O(N^2)$ work, rendering large-scale calculations impractical. To overcome this obstacle, fast multipole methods have been developed over the last decade that reduce the operation count to $O(N)$ for small $\omega$ (low-frequency scattering) and $O(N \log N)$ for $\omega \sim N$ (high-frequency scattering).1-9 Still, in the 3D case the constant implicit in the $O(N)$ notation is quite large, especially for high precision in the low-frequency regime.

We present the analytic foundations for a new version of the fast multipole method for the scalar Helmholtz equation in the low-frequency regime. The computational cost of existing FMM implementations is dominated by the expense of translating far-field partial wave expansions to local ones, requiring $189p^4$ or $189p^3$ operations per box, where harmonics up to order $p$ have been retained. By developing a new expansion in plane waves, we can diagonalize these translation operators. The new low-frequency FMM (LF-FMM) requires $40p^2 + 6p^3$ operations per box.


1  T  \n.  6 a  version  on 
FMM  recently  developed10-11  for  the  Laplace  equati 
(co  -  0),  which  replaces  the  classical  multipole  expansi 
with  a  representation  in  terms  of  evanescent  plane  wai 
to  diagonalize  certain  translation  operators.  It  bears  soi 
resemblance  to  the  FMM  for  the  Helmholtz  equati 
Vladimir  Rokhlin  developed,1'3  which  uses  an  expansi, 
in  terms  of  propagating  plane  waves  to  diagonalize  trar 
lation  operators.  The  latter  method,  which  we  will  ref 
to  as  the  high-frequency  FMM  (HF-FMM),  is  numei 

mlA^nStable  3t  subwavelen&til  spatial  scales.  The  L. 

i  we  present  uses  a  combination  of  evanescent  ai 
propagating  modes  and  blends  the  FMM  and  HF-FMi 
together seamlessly.

$$\Delta\Phi + \omega^2\Phi = f \quad \text{in } \Omega \subset \mathbb{R}^3, \qquad (1)$$

$$\frac{\partial\Phi}{\partial n} + \alpha\Phi = g \quad \text{on } \partial\Omega, \qquad (2)$$

$$r\left(\frac{\partial\Phi}{\partial r} - i\omega\Phi\right) \to 0 \quad \text{as } r \to \infty, \qquad (3)$$

where $\Omega$ is an exterior domain and $\partial\Omega$ is its boundary. In applying integral-equation methods to Equations 1 and 2, we must repeatedly evaluate sums of the form

$$\Phi(x_k) = \sum_{j=1,\, j \neq k}^{N} q_j\, \frac{e^{i\omega\|x_k - x_j\|}}{\|x_k - x_j\|}, \qquad (4)$$

where the points $x_k$ are in $\mathbb{R}^3$, because $e^{i\omega r}/r$ is the free space Green's function for the Helmholtz equation satisfying the Sommerfeld radiation condition (Equation 3).

Figure 1. Solving the Helmholtz equation.


The multipole expansion

We now briefly define the multipole (or partial-wave) expansion due to a collection of point sources and describe some of its properties.12-14 We will need a variety of special functions, whose definitions we collect here.

Definition 1

$P_n(x)$ denotes the Legendre polynomial of degree $n$, and $P_n^m(x)$ denotes the associated Legendre function of degree $n$ and order $m$. Using the Rodrigues formula,

$$P_n^m(x) = (-1)^m (1 - x^2)^{m/2}\, \frac{d^m}{dx^m} P_n(x).$$

The spherical harmonic of degree $n$ and order $m$ is denoted by

$$Y_n^m(\theta, \phi) = \sqrt{\frac{2n+1}{4\pi}\, \frac{(n - |m|)!}{(n + |m|)!}}\; P_n^{|m|}(\cos\theta)\, e^{im\phi}. \qquad (5)$$

We define the spherical Bessel and Hankel functions $j_n(r)$, $h_n^{(1,2)}(r)$ in terms of the usual Bessel and Hankel functions via

$$j_n(r) = \sqrt{\frac{\pi}{2r}}\, J_{n+1/2}(r), \qquad h_n^{(1,2)}(r) = \sqrt{\frac{\pi}{2r}}\, H_{n+1/2}^{(1,2)}(r).$$

Because we will always be working with Hankel functions of the first kind, we will use $h_n(r)$ as an abbreviation of $h_n^{(1)}(r)$. In particular,

$$h_0(\omega r) = \frac{e^{i\omega r}}{i\omega r}.$$

Theorem 1: multipole expansion

Suppose that $J$ sources of strengths $\{q_j,\ j = 1, \ldots, J\}$ are located at the points $\{x_j = (\rho_j, \alpha_j, \beta_j),\ j = 1, \ldots, J\}$ with $|\rho_j| < a$. Then for any $x = (r, \theta, \phi) \in \mathbb{R}^3$ with $r > a$, the potential

$$\Phi(x) = \sum_{j=1}^{J} q_j\, \frac{e^{i\omega\|x - x_j\|}}{\|x - x_j\|}$$

is given by

$$\Phi(x) = 4\pi \sum_{n=0}^{\infty} \sum_{m=-n}^{n} M_n^m\, h_n(\omega r)\, Y_n^m(\theta, \phi), \qquad (6)$$

where

$$M_n^m = i\omega \sum_{j=1}^{J} q_j\, j_n(\omega\rho_j)\, Y_n^{-m}(\alpha_j, \beta_j). \qquad (7)$$

Furthermore, for any $p > \omega a$,

$$\left| \Phi(x) - 4\pi \sum_{n=0}^{p} \sum_{m=-n}^{n} M_n^m\, h_n(\omega r)\, Y_n^m(\theta, \phi) \right| = O\!\left(\left(\frac{a}{r}\right)^{p}\right). \qquad (8)$$

Note that for Theorem 1, $\omega a$ is a measure of the radius of the enclosing sphere in terms of wavelengths. Thus, according to Equation 8, the multipole expansion does not begin to converge until the number of terms in the expansion $p$ is of the same order as the number of wavelengths in the (smallest) enclosing sphere. Once enough terms are present, the error decay is quite rapid. Because we are interested in the low-frequency regime, we will assume that the first condition is always satisfied. If we now suppose that $r = 2a$ in the context of Theorem 1, then Equation 8 implies that

$$\left| \Phi(x) - 4\pi \sum_{n=0}^{p} \sum_{m=-n}^{n} M_n^m\, h_n(\omega r)\, Y_n^m(\theta, \phi) \right| = O\!\left(\left(\frac{1}{2}\right)^{p}\right), \qquad (9)$$

and setting $p = \log_2(1/\varepsilon)$ yields a precision $\varepsilon$.

While an $N \log N$ algorithm can be constructed for the N-body problem based on only the preceding theorem, it performs poorly in 3D. The FMM relies on a more complex analysis and uses several translation operators. Because the details of such a scheme have been fully described many times,5,6,10,15 we will not repeat them here. Instead, we will concentrate on the one translation operator whose cost is dominant in existing FMM implementations.
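The error estimate of Theorem 1 can be checked numerically for a single unit source. The following is a minimal sketch, not the authors' code: it sums over $m$ analytically with the Legendre addition theorem, so that the truncated expansion becomes $i\omega \sum_{n \le p} (2n+1)\, j_n(\omega\rho)\, h_n(\omega r)\, P_n(\cos\gamma)$, where $\gamma$ is the angle between source and target. The function names and the test geometry ($\rho = a = 1$, $r = 2a$, $\omega = 1$) are illustrative assumptions.

```python
import cmath
import math

def sph_jn(n, x):
    """Spherical Bessel j_n(x) from its power series (adequate for moderate x)."""
    term = 1.0
    for k in range(1, n + 1):
        term *= x / (2 * k + 1)            # builds x^n / (2n+1)!!
    total = term
    for k in range(1, 60):
        term *= -0.5 * x * x / (k * (2 * n + 2 * k + 1))
        total += term
    return total

def sph_yn(n, x):
    """Spherical Neumann y_n(x) by upward recurrence, which is stable for y_n."""
    y0 = -math.cos(x) / x
    if n == 0:
        return y0
    y1 = -math.cos(x) / x ** 2 - math.sin(x) / x
    for k in range(1, n):
        y0, y1 = y1, (2 * k + 1) / x * y1 - y0
    return y1

def legendre(n, t):
    """Legendre polynomial P_n(t) by the three-term recurrence."""
    if n == 0:
        return 1.0
    p0, p1 = 1.0, t
    for k in range(2, n + 1):
        p0, p1 = p1, ((2 * k - 1) * t * p1 - (k - 1) * p0) / k
    return p1

def truncation_error(p, omega=1.0, rho=1.0, r=2.0, cos_gamma=0.3):
    """|exact kernel - expansion truncated at degree p| for one unit source."""
    dist = math.sqrt(r * r + rho * rho - 2.0 * r * rho * cos_gamma)
    exact = cmath.exp(1j * omega * dist) / dist
    approx = 0j
    for n in range(p + 1):
        h_n = complex(sph_jn(n, omega * r), sph_yn(n, omega * r))
        approx += (2 * n + 1) * sph_jn(n, omega * rho) * h_n * legendre(n, cos_gamma)
    return abs(exact - 1j * omega * approx)
```

With $r = 2a$ the error should fall roughly like $(1/2)^p$, in line with Equation 9; calling `truncation_error(p)` for increasing `p` shows the decay.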

Theorem 2: multipole-to-local conversion

Suppose that $J$ sources of strengths $q_1, q_2, \ldots, q_J$ are located inside a sphere of radius $a$ with the center at the origin. Suppose also that $Q = (\rho, \alpha, \beta)$, and that $\rho > (c + 1)a$ with $c > 1$. Then the multipole expansion (Equation 6) converges inside the sphere $D_Q$ of radius $a$ centered at $Q$. Inside $D_Q$, the potential due to the charges $q_1, q_2, \ldots, q_J$ is described by a local expansion:

$$\Phi(x) = \sum_{l=0}^{\infty} \sum_{k=-l}^{l} L_l^k\, j_l(\omega r')\, Y_l^k(\theta', \phi'), \qquad (10)$$

where $(r', \theta', \phi')$ are the coordinates of $x$ with respect to the center $Q$. Furthermore, for any $p > \omega a$,

$$\left| \Phi(x) - \sum_{l=0}^{p} \sum_{k=-l}^{l} L_l^k\, j_l(\omega r')\, Y_l^k(\theta', \phi') \right| = O\!\left(\left(\frac{1}{c}\right)^{p}\right). \qquad (11)$$

For Theorem 2, the matrix that converts the multipole coefficients $\{M_n^m\}$ into the local coefficients $\{L_l^k\}$ is rather complicated,5,16 and we omit it. We simply observe here that the matrix is dense, so applying it to a truncated expansion with $O(p^2)$ harmonics requires $O(p^4)$ work.

Although, as indicated above, we will not describe the full 3D fast multipole algorithm, it is based on a hierarchical subdivision of space. For this, we assume that all sources are contained in a box of side length $D$, which we refer to as refinement level 0. We obtain refinement level $l + 1$ recursively from level $l$ by subdividing each box into eight equal parts. This yields a natural tree structure, where the eight boxes at level $l + 1$ obtained by subdividing a box at level $l$ are considered its children. Below we define near neighbors and well-separated boxes at the same refinement level (Definitions 2 and 3) as well as the interaction list associated with each box (Definition 4).


Table 1. The six lists associated with the interaction list, one for each direction.

List       Contents
+z list    Separated by at least one box in the +z direction
-z list    Separated by at least one box in the -z direction
+y list    Separated by at least one box in the +y direction and not contained in the +z or -z lists
-y list    Separated by at least one box in the -y direction and not contained in the +z or -z lists
+x list    Separated by at least one box in the +x direction and not contained in the +z, -z, +y, or -y lists
-x list    Separated by at least one box in the -x direction and not contained in the +z, -z, +y, or -y lists

•  Definition 2. Two boxes are said to be near neighbors if they are at the same refinement level and share a boundary point (a box is a near neighbor of itself).

•  Definition 3. Two boxes are said to be well separated if they are at the same refinement level and are not near neighbors.

•  Definition 4. With each box i is associated an interaction list, consisting of the children of the near neighbors of i's parent which are well separated from box i.

A simple counting argument shows that the interaction list contains up to 189 boxes. In the FMM, the most expensive step is converting the multipole expansion for each box into the 189 different local expansions that the boxes in its interaction list require. If there are $M$ boxes in the hierarchy, then this requires $O(189 p^4 M)$ work.
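The counting argument can be reproduced by brute force. The sketch below assumes a uniform octree in which a box at a given level is identified by integer coordinates in the range 0 to 2^level - 1; the function name `interaction_list` is illustrative. A parent has at most 27 near neighbors with 8 children each (216 boxes), of which 27 are near neighbors of the box itself, leaving 189.

```python
from itertools import product

def interaction_list(box, level):
    """Interaction list (Definition 4) of `box` in a uniform octree.

    `box` is a tuple of integer coordinates in [0, 2**level); level >= 1.
    """
    parents_per_side = 2 ** (level - 1)
    parent = tuple(c // 2 for c in box)
    result = []
    for shift in product((-1, 0, 1), repeat=3):
        neighbor = tuple(p + s for p, s in zip(parent, shift))
        if not all(0 <= c < parents_per_side for c in neighbor):
            continue                      # outside the computational domain
        for corner in product((0, 1), repeat=3):
            child = tuple(2 * c + d for c, d in zip(neighbor, corner))
            # Well separated means same level and not sharing a boundary point.
            if max(abs(a - b) for a, b in zip(child, box)) > 1:
                result.append(child)
    return result
```

Boxes near the domain boundary have shorter lists because some of the parent's neighbors do not exist; the maximum of 189 is reached only for boxes whose parent has all 27 near neighbors.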


Diagonal  form  of  translation 
operators 

The new generation of FMMs is based on combining multipole expansions with exponential or plane-wave expansions. A complicating feature of this approach, however, is that we need six different expansions for each box, one emanating from each face of the cube. The interaction list for each box is subdivided into six lists, one associated with each direction. Figure 2 shows the +z list for the box B, and Table 1 explains the six lists for the interaction list. After reviewing Table 1, it is easy to verify that the original interaction list is equal to the union of the +z, -z, +y, -y, +x, and -x lists.
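The subdivision into six lists can be illustrated with integer box offsets. This sketch assumes that "separated by at least one box in the +z direction" means an offset of at least 2 box widths in z, and it considers a box in the lower corner of its parent, so that the offsets of the parent's neighbors' children run from -2 to 3 along each axis. The function names are illustrative.

```python
from itertools import product

def interior_offsets():
    """Offsets (dx, dy, dz) of a full 189-box interaction list."""
    return [d for d in product(range(-2, 4), repeat=3)
            if max(abs(c) for c in d) > 1]

def split_interaction_list(offsets):
    """Assign each offset to exactly one list, in the priority order z, y, x."""
    lists = {name: [] for name in ("+z", "-z", "+y", "-y", "+x", "-x")}
    for dx, dy, dz in offsets:
        if dz >= 2:
            lists["+z"].append((dx, dy, dz))
        elif dz <= -2:
            lists["-z"].append((dx, dy, dz))
        elif dy >= 2:
            lists["+y"].append((dx, dy, dz))
        elif dy <= -2:
            lists["-y"].append((dx, dy, dz))
        elif dx >= 2:
            lists["+x"].append((dx, dy, dz))
        elif dx <= -2:
            lists["-x"].append((dx, dy, dz))
        else:
            raise ValueError("offset belongs to a near neighbor")
    return lists
```

Because the tests are applied in a fixed order, the six lists are disjoint, and every well-separated offset has at least one component of magnitude 2 or more, so their union is the whole interaction list.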

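The starting point for our analysis is the integral representation

$$\frac{e^{i\omega r}}{r} = \frac{1}{2\pi} \int_0^{\infty} e^{-\sqrt{\lambda^2 - \omega^2}\, z} \int_0^{2\pi} e^{i\lambda(x\cos\alpha + y\sin\alpha)}\, d\alpha\; \frac{\lambda}{\sqrt{\lambda^2 - \omega^2}}\, d\lambda, \qquad (12)$$

which is valid for $z > 0$. It is straightforward to derive from the 3D Fourier transform of the kernel $e^{i\omega r}/r$, followed by contour integration. We need the restriction $z > 0$ for the contour integral to be well-defined.14 The 2D formula is given in the “2D Fourier transform” sidebar.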

Note that, for $0 < \lambda < \omega$, the modes propagate without attenuation, while for $\omega < \lambda < \infty$, they decay. We refer to the first region as the propagating part of the spectrum and the second as the evanescent part. For the propagating part, we change variables $\lambda = \omega \sin\theta$ (see Figure 3 for the resulting equation). For the evanescent part, we change variables $\sigma^2 = \lambda^2 - \omega^2$ (see Figure 4 for the resulting equation).

Figure 3. For the propagating part of the spectrum, we change variables $\lambda = \omega \sin\theta$ (Equation 13).

Figure 4. For the evanescent part of the spectrum, we change variables $\sigma^2 = \lambda^2 - \omega^2$ (Equation 14).

In Equation 12, as $\omega \to 0$, the propagating part disappears, leaving only the evanescent spectrum. This is the integral representation for $1/r$ used in


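2D Fourier transform

In 2D, the analog of Equation 12 (see the main text) is

$$H_0(\omega r) = \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{e^{i\sqrt{\omega^2 - \lambda^2}\, y}\, e^{i\lambda x}}{\sqrt{\omega^2 - \lambda^2}}\, d\lambda, \qquad (23)$$

which is valid for $y > 0$.

The propagating part, as above, covers the range $|\lambda| < \omega$. Using the change of variables $\lambda = \omega \cos\theta$ yields

$$\left(H_0(\omega r)\right)_{\text{prop}} = \frac{1}{\pi} \int_0^{\pi} e^{i\omega(x\cos\theta + y\sin\theta)}\, d\theta. \qquad (24)$$

For the evanescent part, we make the change of variables $\sigma^2 = \lambda^2 - \omega^2$, so that

$$\left(H_0(\omega r)\right)_{\text{evan}} = \frac{1}{i\pi} \int_0^{\infty} e^{-\sigma y}\, \frac{e^{i\sqrt{\sigma^2 + \omega^2}\, x} + e^{-i\sqrt{\sigma^2 + \omega^2}\, x}}{\sqrt{\sigma^2 + \omega^2}}\, d\sigma. \qquad (25)$$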

July-September  1 998 




Discretization 


the FMM, hence our assertion that we have a seamless transition to the zero-frequency case.

The next problem we face is that of discretization. The integrand for the propagating part is smooth, and we achieve high-order accuracy via Gaussian quadrature in the $\theta$ direction and the trapezoidal rule in the $\alpha$ direction. The evanescent part is more complicated. The inner integral with respect to $\alpha$ is easily handled by the trapezoidal rule (which achieves spectral accuracy for periodic functions), but the outer integral requires more
care.  We  use  generalized  Gaussian 
quadrature  rules,1'  designed  with 
the  geometry  of  the  interaction  list 
in  mind.  We  present  our  analysis 
in  the  “Discretization”  sidebar. 
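The discretization strategy can be tried out on Equation 12 directly. The sketch below is an illustration under stated assumptions, not the authors' scheme: it takes the branch $\sqrt{\lambda^2 - \omega^2} = -i\sqrt{\omega^2 - \lambda^2}$ on the propagating part (so that the result is the outgoing kernel $e^{i\omega r}/r$; the sign of the propagating phase follows from this choice), uses Gauss-Legendre quadrature in $\theta$ and the trapezoidal rule in $\alpha$ as described above, and replaces the generalized Gaussian rules for the outer evanescent integral with a composite Simpson rule on a truncated interval. All function names and default node counts are hypothetical.

```python
import cmath
import math

def gauss_legendre(n, a, b):
    """Nodes and weights of the n-point Gauss-Legendre rule on [a, b]."""
    nodes, weights = [], []
    for i in range(1, n + 1):
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))   # initial guess
        for _ in range(50):                               # Newton iteration
            p0, p1 = 1.0, x
            for k in range(2, n + 1):
                p0, p1 = p1, ((2 * k - 1) * x * p1 - (k - 1) * p0) / k
            dp = n * (x * p1 - p0) / (x * x - 1.0)        # P_n'(x)
            step = p1 / dp
            x -= step
            if abs(step) < 1e-15:
                break
        nodes.append(0.5 * (b - a) * x + 0.5 * (b + a))
        weights.append((b - a) / ((1.0 - x * x) * dp * dp))
    return nodes, weights

def angular_mean(lam, x, y, m):
    """Trapezoidal rule for (1/2pi) * integral of exp(i lam (x cos a + y sin a))."""
    total = 0j
    for j in range(m):
        alpha = 2.0 * math.pi * j / m
        total += cmath.exp(1j * lam * (x * math.cos(alpha) + y * math.sin(alpha)))
    return total / m

def plane_wave_kernel(x, y, z, omega, n_theta=24, m_alpha=32, n_sigma=4000):
    """Approximate exp(i omega r)/r for z > 0 from Equation 12."""
    # Propagating part: lambda = omega sin(theta), theta in [0, pi/2].
    thetas, v = gauss_legendre(n_theta, 0.0, 0.5 * math.pi)
    prop = 0j
    for theta, weight in zip(thetas, v):
        prop += (weight * math.sin(theta)
                 * cmath.exp(1j * omega * z * math.cos(theta))
                 * angular_mean(omega * math.sin(theta), x, y, m_alpha))
    prop *= 1j * omega
    # Evanescent part: sigma^2 = lambda^2 - omega^2, Simpson rule on [0, 30/z].
    sigma_max = 30.0 / z
    h = sigma_max / n_sigma
    evan = 0j
    for k in range(n_sigma + 1):
        sigma = k * h
        coeff = 1 if k in (0, n_sigma) else (4 if k % 2 else 2)
        evan += (coeff * math.exp(-sigma * z)
                 * angular_mean(math.sqrt(sigma * sigma + omega * omega), x, y, m_alpha))
    return prop + evan * h / 3.0
```

Setting `omega=0.0` removes the propagating part and leaves the evanescent integral for $1/r$, which is the zero-frequency limit mentioned in the text.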

Incorporation  into  LF-FMM 

Consider now the interaction list for a box B in the context of a fast multipole code, for which we need $189p^4$ operations with the naive multipole-to-local translation operator, and $189p^3$ operations using rotation matrices.10 Using the analysis outlined in the “Discretization” sidebar, we can generate six outgoing exponential expansions at a cost of $6p^3$ work and translate them all at a cost of $189p^2$ work. Once a box has received the incoming exponential expansions from all directions, it can convert them to a single local expansion, using an additional $6p^3$ operations. Thus, the total work scales like $12p^3 + 189p^2$ operations per box. Further symmetry considerations reduce this to $6p^3 + 40p^2$ operations.10
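To put the three operation counts side by side, the short sketch below evaluates them for a given expansion order $p$. The counts are the ones quoted in the text; the function name and dictionary keys are illustrative.

```python
def translation_costs(p):
    """Per-box operation counts for multipole-to-local translation at order p."""
    return {
        "naive": 189 * p ** 4,                    # dense translation matrices
        "rotation": 189 * p ** 3,                 # rotation-based translation
        "plane_wave": 6 * p ** 3 + 40 * p ** 2,   # exponential expansions with symmetry
    }
```

For $p = 10$, for example, the counts are 1,890,000, 189,000, and 10,000 operations per box, so the plane-wave scheme is about 19 times cheaper than the rotation-based one at that order.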


Because of the restriction that $z > 0$, we assume, for the moment, that a source $Q = (x_0, y_0, z_0)$ is contained in a box B and that a target $P = (x, y, z)$ lies in a box $C \in$ +z-list(B). To fix spatial scales, we assume that B and C have unit volume and that they are separated in the z-direction by one or two unit distances. We then have the following result.1

Lemma 1: plane wave representation

Let $r_{PQ}$ denote the distance from $Q \in B$ to $P \in C \in$ +z-list(B), and let $\{\theta_1, \ldots, \theta_N\}$ and $\{v_1, \ldots, v_N\}$ be the nodes and weights for N-point Gauss-Legendre quadrature on the interval $[0, \pi/2]$. Then there exist weights $\mu_1, \ldots, \mu_s$, nodes $\sigma_1, \ldots, \sigma_s$, and integers $M(1), \ldots, M(s)$, so that

e  PQ  _  Qy y'  sin 6k  ^<u[-cose* i h- sin 6k [i *-.tp  >cos a, y~\0  (Sin cr ,  j 


*- ' cos  a: +( V-  v0  >  sin  dj  ] 


<  £ 


(15) 


for  0  <  o)  rPQ  <  7 0 ,  where  a-{  =  2n  The  total  number  of  expo¬ 
nentials  required,  which  we  denote  by  Sexfy  satisfies 

s 

SexP  =  N2  +  X  =  0(log;f ) . 

*=i 


Norman Yarvin and Vladimir Rokhlin supply us with the weights and nodes $\mu_k$ and $\sigma_k$ for the evanescent modes.2 For six-digit accuracy, the total number of modes we require is approximately 600: 150 for the propagating spectrum and 450 for the evanescent spectrum. Ten-digit accuracy requires 1,500 modes: 300 for the propagating spectrum and 1,200 for the evanescent spectrum. (The FMM for the Laplace equation requires 280 modes at six-digit accuracy and 900 modes at 10-digit accuracy.)

Corollary 1

Let B be a box of unit volume centered at the origin containing $L$ sources of strengths $\{q_l,\ l = 1, \ldots, L\}$ located at the points $\{Q_l = (x_l, y_l, z_l),\ l = 1, \ldots, L\}$. Then for any $P$ contained in +z-list(B), the potential $\Phi(P)$ satisfies

$$\left| \Phi(P) - \sum_{k=1}^{N} \sum_{j=1}^{N} W_P(k,j)\, e^{-i\omega\cos\theta_k z}\, e^{i\omega\sin\theta_k\cos\alpha_j x}\, e^{i\omega\sin\theta_k\sin\alpha_j y} - \sum_{k=1}^{s} \sum_{j=1}^{M(k)} W_E(k,j)\, e^{-\sigma_k z}\, e^{i\sqrt{\sigma_k^2 + \omega^2}\cos\alpha_j x}\, e^{i\sqrt{\sigma_k^2 + \omega^2}\sin\alpha_j y} \right| < A\varepsilon, \qquad (16)$$

where $A = \sum_{l=1}^{L} |q_l|$.

Corollary 2: Diagonal translation

Let B be a box of unit volume centered at the origin containing $L$ charges of strengths $\{q_l,\ l = 1, \ldots, L\}$, located at the points $\{Q_l = (x_l, y_l, z_l),\ l = 1, \ldots, L\}$ and let C be a box in +z-list(B) centered at $(x_c, y_c, z_c)$. For $P \in C$, let the potential $\Phi(P)$ be approximated by the exponential expansion centered at the origin

$$\Phi(P) = \sum_{k=1}^{N} \sum_{j=1}^{N} W_P(k,j)\, e^{-i\omega\cos\theta_k z}\, e^{i\omega\sin\theta_k(\cos\alpha_j x + \sin\alpha_j y)} + \sum_{k=1}^{s} \sum_{j=1}^{M(k)} W_E(k,j)\, e^{-\sigma_k z}\, e^{i\sqrt{\sigma_k^2 + \omega^2}(\cos\alpha_j x + \sin\alpha_j y)}. \qquad (19)$$

Then

$$\Phi(P) = \sum_{k=1}^{N} \sum_{j=1}^{N} V_P(k,j)\, e^{-i\omega\cos\theta_k (z - z_c)}\, e^{i\omega\sin\theta_k(\cos\alpha_j (x - x_c) + \sin\alpha_j (y - y_c))} + \sum_{k=1}^{s} \sum_{j=1}^{M(k)} V_E(k,j)\, e^{-\sigma_k (z - z_c)}\, e^{i\sqrt{\sigma_k^2 + \omega^2}(\cos\alpha_j (x - x_c) + \sin\alpha_j (y - y_c))}, \qquad (20)$$

where

$$V_P(k,j) = W_P(k,j)\, e^{-i\omega\cos\theta_k z_c}\, e^{i\omega\sin\theta_k\cos\alpha_j x_c}\, e^{i\omega\sin\theta_k\sin\alpha_j y_c}, \qquad (21)$$

$$V_E(k,j) = W_E(k,j)\, e^{-\sigma_k z_c}\, e^{i\sqrt{\sigma_k^2 + \omega^2}\cos\alpha_j x_c}\, e^{i\sqrt{\sigma_k^2 + \omega^2}\sin\alpha_j y_c}. \qquad (22)$$

Equations 21 and 22 are, in some sense, the centerpiece of the new scheme. They show that $p^2$ degrees of freedom describing the far field due to sources in a box B can be transmitted to a box C in its interaction list using $p^2$ operations. In other words, in a plane-wave expansion, translation is equivalent to multiplication (see Figure A).

Figure A. In the new FMM, we can replace a large number of multipole-to-local translations, costing $O(p^3)$ or $O(p^4)$ work, with a large number of exponential translations, costing $O(p^2)$ work.

In an actual FMM implementation, we will be given the multipole expansion for a box B rather than the source distribution itself, so we will need to convert it to an exponential expansion. Moreover, after translating an exponential expansion, we must convert it to a local harmonic expansion of the form (see Equation 10 in the main text). The formulae are rather complex, and we avoid going into detail.2 Here, we simply observe that $O(p^3) = O(\log^3 \varepsilon)$ work is required for each step.

Up to this point, we have considered only the exponential expansion needed for the +z list. To obtain expansions appropriate for each of the other five lists, we simply rotate the coordinate system so that the z axis points in the desired direction. The cost of rotation also scales as $O(p^3)$.

Significant implementation work remains, including the coupling of this LF-FMM with an HF-FMM, once the dimensions of a box are on the order of a wavelength. Current HF-FMM implementations have been able to investigate structures that are many wavelengths across, but only those with smooth surfaces. A hybrid code will be able to include subwavelength mesh refinement and will greatly enhance the range of future simulation efforts.

A Scalable Multilevel Helmholtz FMM for the Origin 2000*


Mark  A.  Stalzer* 


Abstract 

Presented  is  a  parallel  algorithm  based  on  the  multilevel  fast  multipole  method 
(FMM)  for  the  Helmholtz  equation.  This  variant  of  the  FMM  is  useful  for  electro¬ 
magnetic  scattering  calculations.  The  algorithm  was  implemented  on  an  SGI  Origin 
2000  using  a  threaded  approach  without  explicit  message  passing.  To  achieve  good 
scalability,  steps  in  the  FMM  that  intrinsically  require  inter- processor  communications 
(applying  far  field  translation  operators)  were  modified  to  improve  cache  performance 
and  minimize  communications  costs. 


1  Introduction 

This paper presents a scalable parallel version of the multilevel fast multipole method (FMM) for the Helmholtz equation: $(\nabla^2 + k^2)\Psi = \rho$. This variant of the FMM is useful for computing scattering cross sections and antenna radiation patterns [2, 3, 5, 6]. This is in contrast to the FMM for the Laplace equation, $\nabla^2\Psi = \rho$, which is applicable to the N-body problem. A substantial amount of work has been done on parallelizing the (multilevel) Laplace FMM [4, 8, 10] and the single-level Helmholtz FMM [7, 9]. The emphasis here is on a scalable parallel multilevel Helmholtz FMM.

This  paper  is  organized  as  follows.  In  the  next  section,  the  basics  of  the  multilevel 
Helmholtz  FMM  are  reviewed.  In  Section  3,  the  computation  model  is  presented  followed 
by  the  details  of  the  parallel  FMM  implementation  in  Section  4.  Scalability  results  are 
given  in  Section  5  followed  by  some  concluding  remarks. 

2  Fast  Multipole  Method 

A method of frequent choice for computing scattering cross sections and radiation patterns is to solve a matrix equation, $Z \cdot I = V$, derived from the discretization of an integral equation. The number of unknowns $N$ required for accurate modeling of such problems can be very large, which severely limits problem size. The system can be solved by factoring the dense matrix $Z$ (an $O(N^3)$ operation), or by using an iterative technique which requires $O(N^2)$ operations per iteration. The $O(N^2)$ operation in iterative solvers is the multiplication of an approximation $I$ by the impedance matrix $Z$. In contrast, the FMM works by recursively decomposing $Z$ into sparse components that can be applied in $O(N \log^2 N)$ time.

The basic approach given here follows the paper by Gyure and Stalzer [5]. Consider two well-separated spheres of radius $R_1$ and $R_2$, each containing a collection of Helmholtz sources. The field due to an individual source is given by

(1)  $\Phi(\mathbf{r}) = G(\mathbf{r}) = \dfrac{e^{ik_0 r}}{k_0 r}$

where $\mathbf{r}$ is relative to the source and $k_0$ is the free space wavenumber. (Given a vector $\mathbf{r}$, $r$ is its magnitude and $\hat{r}$ is the corresponding unit vector.) We want to quickly evaluate the field generated by all the sources in $R_1$ at every source in $R_2$. This field can be written as a multipole expansion valid outside of $R_1$ as

(2)  $\Phi(\mathbf{r}) = \sum_{lm} \beta_{lm}\, h_l(kr)\, Y_{lm}(\theta, \phi)$

where $r$, $\theta$, and $\phi$ are relative to a coordinate system centered in $R_1$, $h_l(kr)$ are spherical Hankel functions of the first kind, and $Y_{lm}(\theta, \phi)$ are normalized spherical harmonics. We'll refer to this expansion as an h-expansion. Similarly, we can write an expression for the field valid inside $R_2$:

(3)  $\Phi(\mathbf{r}) = \sum_{lm} \alpha_{lm}\, j_l(kr)\, Y_{lm}(\theta, \phi)$

where the coordinate system is now centered in $R_2$, and $j_l(kr)$ are spherical Bessel functions. We'll refer to this expansion as a j-expansion. For the moment, we consider both of these to be infinite sums. The FMM then rests on three observations:

•  The origin of an h-expansion can be shifted arbitrarily inside $R_1$, and a new set of coefficients, $\tilde{\beta}_{lm}$, can be computed for this new expansion. The same holds for shifting a j-expansion arbitrarily to a new origin inside of $R_2$, which results in a new set of coefficients, $\tilde{\alpha}_{lm}$.

•  An h-expansion valid outside of $R_1$ can be translated and converted into a j-expansion valid inside $R_2$, resulting in a new set of coefficients for the j-expansion.

•  Most crucial, these shifts and translations can be done efficiently by transforming the coefficients into a basis in which both operators are diagonal.

The far field transform of an arbitrary function $f(\hat{k})$ is

(4)  $f(\hat{k}) = \sum_{lm} i^{l}\, Y_{lm}(\hat{k})\, f_{lm}$

and the inverse transform is given by

(5)  $f_{lm} = \int d\hat{k}\; i^{-l}\, Y_{lm}^{*}(\hat{k})\, f(\hat{k})$

where $\hat{k}$ is a unit vector represented by polar and azimuthal angular components: $(k_\theta, k_\phi)$.

It is in this $\hat{k}$-basis that the shift and translation operators are diagonal. An h-expansion in its far-field basis is shifted from a point $\mathbf{x}$ to another point $\mathbf{x}'$, both inside of $R_1$, by

(6)  $\tilde{\beta}(\hat{k}) = \lambda(\hat{k}, \mathbf{x}' - \mathbf{x})\, \beta(\hat{k})$

where $\lambda$ is given by

(7)  $\lambda(\hat{k}, \mathbf{x}' - \mathbf{x}) = e^{ik_0 \hat{k} \cdot (\mathbf{x}' - \mathbf{x})}$

The same shift operator $\lambda$ also applies to j-expansions. It represents a “local” shift in the group center, retaining the exterior or interior expansion.

The translation of an h-expansion into a j-expansion is through the translation operator $\mu$, which, in the far-field basis, is

(8)  $\mu(\hat{k}, \mathbf{x}' - \mathbf{x}) = \sum_{l} i^{l} (2l + 1)\, h_l(k_0 |\mathbf{x}' - \mathbf{x}|)\, P_l\!\left(\hat{k} \cdot (\mathbf{x}' - \mathbf{x}) / |\mathbf{x}' - \mathbf{x}|\right)$

where the $P_l$ are Legendre polynomials.

In practice the expansions are truncated to a finite number of terms $L$ depending on the group size and desired accuracy. The mathematical validity of this truncation is addressed by Rokhlin [6] but it is related to the fact that these series are asymptotic and are, therefore, of controllable accuracy. Empirically, it has been determined that the number of terms $L$ needed in the expansions for a region of diameter $D$ is [2]

(9)  L  =  k0D  +  ^\og(k0D  +  n) 
where  d  is  the  desired  number  of  digits. 

The above expressions for the translation operators, together with the far field transform, are the basic tools used to construct a multilevel FMM algorithm. Clearly the field caused by a collection of sources inside an arbitrary group $G_1$ can be evaluated at any point inside a second group $G_2$ by converting the exterior h-expansion, valid outside $G_1$, to an interior j-expansion which is valid inside $G_2$. Also, we can calculate the field at that point caused by the sources in $G_1$ by computing $\alpha_{00}$, the leading term in the j-expansion. No other terms contribute, because the expansion is already centered at the field point where $r = 0$ and all the terms $j_l(0)$ are zero except for $j_0$, which is one. Thus, we can evaluate the field directly through the far-field transform as

(10)  $\Phi(0) = \alpha_{00} = \dfrac{1}{\sqrt{4\pi}} \int d\hat{k}\; \alpha(\hat{k}).$

The abscissae $\hat{k} = (k_\theta, k_\phi)$ of the numerical quadrature rule used to compute this integral are selected so that it can be performed exactly. One choice is to use a trapezoidal rule of $2L$ points in the $\phi$ direction and an $L$-point Gauss-Legendre rule in the $\theta$ direction. This discretization of the $\hat{k}$ basis is used throughout the FMM.

The  multilevel  Helmholtz  FMM  works  in  fundamentally  the  same  way  as  the  Laplace 
FMM  in  that  it  combines  expansions  valid  inside  the  original  groups  to  form  expansions 
valid  inside  correspondingly  larger  groups  with  bigger  group  diameters.  This  recursive 
regrouping  results  in  a  tree-like  structure  that  has  groups  of  different  sizes  at  different 
levels  of  the  tree.  The  h-expansions  from  neighboring  groups  are  shifted  and  combined  into 
a  single  h-expansion  representing  a  larger  group  when  going  up  the  tree,  and  j-expansions 
in  a  large  group  are  converted  to  smaller  groups  going  down  the  tree.  The  details  of  this 
process  are  given  in  the  next  section. 

There is, however, an important mathematical detail. When going up the tree, it is necessary to interpolate the far field representation of a group at one level onto the denser ($\hat{k}$ more closely spaced) basis of the group one level higher. Similarly, when going down the tree, it is necessary to convert to a sparser basis in a filtering process. In both cases, the code converts from the far field basis to the multipole coefficients and then back to the new far field basis using the definitions given in Equations 4 and 5. The actual implementation is in terms of fast Fourier transforms for the $k_\phi$ direction, and fast associated Legendre transforms for the $k_\theta$ direction [11]. As a practical matter, a slow associated Legendre transform which is implemented in terms of matrix multiplication can be used on rather large problems because of the small prefactor in its time complexity relative to the fast transform. However, fetching the transform matrices from memory causes some scalability problems which are addressed in Section 4.2. The details of the filtering and interpolation processes are given in [5].

3  Computation Model

The parallel FMM is implemented using threads assuming a cache-coherent distributed shared memory mechanism such as that on the Origin 2000. The O2000 is constructed as a collection of nodes interconnected by a hypercube. A node consists of two processors, each with two levels of cache, and a local memory that is shared by the processors directly and by all other nodes via the network. To achieve good scalability, it is essential that the caches be used effectively and that crucial data structures are placed in memories close to the processors that will use the structures. This placement is treated in Section 4.2.

The implementation rests on two abstractions: a Barrier and a Counter. These abstractions are implemented in terms of IRIX threads (SPROCS) for the O2000 or POSIX threads for other platforms. A Barrier B has the expected semantics: when a thread calls enter(B), it returns only after all other threads have called enter.

A  Counter  is  a  thread-safe  counter  that  has  two  primary  routines:  reset(C)  and 
next(C,p)  (increment),  where  C  is  a  Counter  and  p  is  a  thread  number.  Counter  is  used  to 
loop  over  groups  at  each  level  in  the  FMM.  The  reset  routine  sets  the  counter  to  zero  and 
acts  as  a  barrier.  The  next  method  returns  the  next  value  of  the  counter.  The  basic  usage 
is  that  all  the  threads  initialize  the  counter  to  zero  with  reset  and  then  enter  a  loop  getting 
the  next  value  of  the  counter  until  all  the  groups  at  a  given  level  have  been  processed. 

There is one additional detail. At a given level $l$ in the FMM grouping there are a certain number of groups $M_l$. Assuming $P$ threads, next for a thread $p$ first returns values in the range $M_l p/P \ldots M_l(p+1)/P - 1$. These are the thread's groups for the level. Once a thread is done processing its groups, next begins to return values corresponding to groups that have not yet been processed by the other threads. When all work is complete, next returns a value $\geq M_l$ and the computation moves on to the next step. The net effect is a sort of dynamic load balancing. This is easy with shared memory, but difficult to achieve with explicit message passing. Two final Counter routines are first(C, p), which returns $M_l p/P$, and last(C, p), which gives $M_l(p+1)/P - 1$.
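The Counter semantics described above can be sketched with standard threading primitives. This is a minimal illustration in Python rather than the IRIX or POSIX thread implementation the paper refers to; the class layout, the use of `threading.Barrier` and a single lock, and the helper `run` are assumptions made for the example. Each thread first drains its own block of group indices and then takes unprocessed indices from the other threads' blocks.

```python
import threading

class Counter:
    """Hands out group indices 0..M-1 to P threads, own block first."""

    def __init__(self, num_groups, num_threads):
        self.M = num_groups
        self.P = num_threads
        self._lock = threading.Lock()
        self._barrier = threading.Barrier(num_threads)
        self._cursor = [self.first(p) for p in range(num_threads)]

    def first(self, p):
        return self.M * p // self.P

    def last(self, p):
        return self.M * (p + 1) // self.P - 1

    def reset(self):
        """Rewind the counter; acts as a barrier across all P threads."""
        if self._barrier.wait() == 0:          # exactly one thread rewinds
            with self._lock:
                self._cursor = [self.first(q) for q in range(self.P)]
        self._barrier.wait()                   # nobody starts before the rewind

    def next(self, p):
        """Next unprocessed group for thread p, or a value >= M when done."""
        with self._lock:
            order = [p] + [q for q in range(self.P) if q != p]
            for q in order:
                if self._cursor[q] <= self.last(q):
                    value = self._cursor[q]
                    self._cursor[q] += 1
                    return value
        return self.M

def run(num_groups, num_threads):
    """Process every group once and return the indices in completion order."""
    counter = Counter(num_groups, num_threads)
    processed, record = [], threading.Lock()

    def worker(p):
        counter.reset()
        m = counter.next(p)
        while m < counter.M:
            with record:
                processed.append(m)            # stand-in for real per-group work
            m = counter.next(p)

    threads = [threading.Thread(target=worker, args=(p,)) for p in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return processed
```

A single lock keeps the sketch short; a production version would more likely use per-block atomic counters so that threads working on their own blocks do not contend.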

4  Parallel  FMM 

A  basic  parallel  FMM  is  presented  next  that  is  implemented  in  terms  of  the  primitives 
defined  above.  The  basic  algorithm  is  then  modified  to  improve  scalability  by  explicitly 
placing  data  structures  in  memory  and  by  ordering  the  use  of  the  translation  operators. 

4.1  Basic  Algorithm 

There are two routines: setup, which builds the data structures necessary for the FMM, and apply, which computes the product $Z \cdot I$.

The setup routine works as follows. First, a tree of groups is constructed. The lowest level ($l = 0$) groups contain elementary sources. Each higher level group at some level $l$ contains up to eight level $l - 1$ subgroups of one half the size. However, since a surface is being discretized, the typical number of subgroups is about four. The top of the tree consists of a single group which contains the entire scatterer. The quantity $H$ is the height of the tree in levels, so that the topmost level is $H - 1$. Let groups($l$) be the set of groups at level $l$, and $M_l$ be the number of elements in this set. Denote the parent of a group $m$ by $m_p$. Finally, let $L_l$ be the number of terms in the expansion at level $l$ as determined by Equation 9.

For each group $m$ two sets (lists) are constructed, nearby($m$) and far($m$), based on the following conditions:

(11)  $m' \in$ nearby($m$) iff $k_0 X_{mm'} < L_l$,

(12)  $m' \in$ far($m$) iff $m' \notin$ nearby($m$) and $k_0 X_{m_p m'_p} < L_{l+1}$

where $m$ and $m'$ are members of groups($l$), and $X_{mm'}$ is the distance between the group centers $X$ and $X'$. In other words, a group is in the nearby list of $m$ if it is too close to use the translation operators at that level. Otherwise, it is in the far list as long as the parents of $m$ and $m'$ are too close to use their translation operators. Interactions between sources are accounted for at the highest possible level.
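Conditions 11 and 12 translate directly into a double loop over the groups at one level. The sketch below is illustrative only: the dictionary-based data layout, the function name `build_lists`, and the brute-force $O(M_l^2)$ search are assumptions for the example, and a real code would restrict the search using the tree.

```python
import math

def build_lists(centers, parents, parent_centers, k0, L_l, L_parent):
    """Build nearby(m) and far(m) for all groups m at one level.

    centers:        group id -> (x, y, z) center of the group
    parents:        group id -> parent group id
    parent_centers: parent id -> (x, y, z) center of the parent
    L_l, L_parent:  expansion lengths at this level and at the parent level
    """
    nearby, far = {}, {}
    for m, xm in centers.items():
        nearby[m], far[m] = [], []
        for m2, xm2 in centers.items():
            if k0 * math.dist(xm, xm2) < L_l:
                nearby[m].append(m2)           # condition (11)
            else:
                xp = parent_centers[parents[m]]
                xp2 = parent_centers[parents[m2]]
                if k0 * math.dist(xp, xp2) < L_parent:
                    far[m].append(m2)          # condition (12)
    return nearby, far
```

For eight groups on a line at x = 0, ..., 7, paired into parents centered at 0.5, 2.5, 4.5, and 6.5, with `k0 = 1`, `L_l = 1.5`, and `L_parent = 3`, group 3 gets nearby list [2, 3, 4] and far list [0, 1, 5]; groups 6 and 7 are left to a higher level.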

The construction of the tree is fast and is done by the main thread. The main thread
then creates $P$ apply threads where $P$ is typically set to the number of processors available.
These threads perform memory allocation and construct the translation operators $\mu$ as
described in Section 4.2. Once the apply threads have finished initializing, the setup is
complete, and the threads wait on a Barrier.

When the iterative solver needs to compute $B = Z \cdot I$ (i.e., apply the operator), it
releases the threads from the Barrier and they execute the steps listed below. The steps
are written in terms of top level loops over groups using Counters. This naturally splits
the work over threads and, hence, processors. This approach scales properly given good
placement of data structures and care in applying translation operators. These issues are
treated in more detail in the next sections. In what follows, the $\beta(\mathbf{k})$ quantities are denoted
by $s$ and the $\alpha(\mathbf{k})$ quantities are denoted by $g$. Loops are written in a C style as for
(initialization; test; update), or as for ($i \in$ set) where $i$ is understood to sequentially take
on all values of the set or range. Each thread $p$ executes the following to carry out the
FMM apply:

Local-to-Far: The far field basis of each $l = 0$ group is constructed from its sources. There
is no need to compute the multipole coefficients since it is a simple matter to compute
the far field directly from the sources.

    for (reset(C_0); m < M_0; m = next(C_0, p))
        for (k ∈ 0 ... K_0 − 1)
            s_mk = Σ_{a ∈ sources(m)} λ(k, X_m − x_a) I_ma

Note that at every level in the tree, there is a Counter $C_l$ controlling the iterations
at that level. The number of far field directions at a level is $K_l = 2 L_l^2$ using the
quadrature rule described in Section 2. It should be clear that each value of an index
$k$ represents some $\mathbf{k} = (k_\theta, k_\phi)$ in the discretized far field basis for that level. The
sources of an $l = 0$ group $m$ are sources($m$), and the location of a source $a$ is $\mathbf{x}_a$.

Uptree: The far fields due to each subgroup of a group are interpolated and shifted to the
group's center and accumulated to form the far field basis of the parent group.

    for (l ∈ 1 ... H − 1)
        for (reset(C_l); m < M_l; m = next(C_l, p))
            for (m' ∈ subgroups(m))
                s_m' = interpolate(s_m')
                for (k ∈ 0 ... K_l − 1)
                    s_mk = s_mk + λ(k, X_m − X_m') s_m'k

Translate: For each group $m$, the far field of each far away group is translated to $m$,
converted to a j-expansion, and accumulated. This gives the field due to all groups
far from $m$ as a j-expansion valid inside of $m$.

    for (l ∈ 0 ... H − 1)
        for (m ∈ first(C_l, p) ... last(C_l, p))
            for (m' ∈ far(m))
                for (k ∈ 0 ... K_l − 1)
                    g_mk = g_mk + μ(k, X_m − X_m') s_m'k

Downtree: The j-expansions are walked down the tree in a way analogous to Uptree.
The code works downward from level $H - 1$, shifting the field $g_m$ of group $m$ to its
subgroups and then filtering (instead of interpolating). The parallel structure is just
like Uptree.

Far-to-Local: At the bottom of the tree, the j-expansions are used to evaluate the field at
each source due to all far away sources. The procedure is the same as the Local-to-Far
step except that $\mathbf{k} \to -\mathbf{k}$. At the end of this step, the result ($B$) has been computed
for all far away interactions.

Direct: To account for interactions between groups that are too close to each other to use
the FMM, the Green function is used directly:

    for (reset(C_0); m < M_0; m = next(C_0, p))
        for (m' ∈ nearby(m))
            for (a ∈ sources(m))
                B_ma = B_ma + Σ_{a' ∈ sources(m')} G(x_a − x_a') I_m'a'
    enter(apply.gate)

$G$ is the Helmholtz kernel as defined in Equation 1. The final step is for all of
the threads to enter a barrier. This ensures that the calculation is complete before
returning to the main thread.

This description of the parallel algorithm is very similar to its sequential counterpart.
The only complications are operations on the Counters, which look like regular loops,
and the Barriers. These similarities between the parallel and sequential algorithms simplify
implementation and maintenance.
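
A rough Python analogue of a Counter-driven loop followed by a Barrier is sketched below. The `Counter` class and `parallel_for` helper are hypothetical stand-ins written for this illustration; the actual Counter and Barrier primitives are not specified beyond their use in the loops above. The sketch shows the control flow only and says nothing about performance.

```python
import threading

class Counter:
    """Shared counter that hands out loop indices one at a time."""

    def __init__(self):
        self._lock = threading.Lock()
        self._next = 0

    def next(self):
        # Atomically fetch the next unclaimed index.
        with self._lock:
            value = self._next
            self._next += 1
            return value

def parallel_for(num_items, num_threads, body):
    """Run body(m) for m in 0..num_items-1, split dynamically over threads."""
    counter = Counter()
    gate = threading.Barrier(num_threads + 1)  # workers plus the main thread

    def worker():
        m = counter.next()
        while m < num_items:      # looks like an ordinary sequential loop
            body(m)
            m = counter.next()
        gate.wait()               # analogous to enter(apply.gate)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    gate.wait()                   # main thread resumes once all work is done
    for t in threads:
        t.join()

results = [0] * 10

def square(m):
    results[m] = m * m            # each index is written by exactly one thread

parallel_for(10, 4, square)
print(results)
```

Because each index is claimed exactly once, the loop body needs no locking of its own as long as iterations write to disjoint data.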

This  algorithm  is  for  scalar  (acoustic  with  Dirichlet  boundary  conditions)  scattering. 
For  the  vector  case  (electromagnetic),  the  work  doubles  because  two  field  components  must 
be  kept  for  each  source  but  the  algorithm  is  otherwise  straightforward.  The  results  in 
Section  5  are  for  electromagnetic  scattering. 

4.2  Memory  Allocation  and  Placement 

To  assist  in  placing  data  structures  in  memory,  IRIX  provides  an  interface  called  dplace. 
During  initialization,  dplace  is  instructed  to  reserve  P/2  local  memories  in  a  cube 
architecture.  When  each  thread  p  is  created  during  the  FMM  setup  phase,  it  instructs 
dplace  to  associate  itself  with  memory  p/2.  The  default  memory  allocation  policy  in  IRIX 
is  “first-touch,”  meaning  that  when  a  thread  allocates  memory,  IRIX  attempts  to  satisfy 
the  request  on  the  node  containing  the  processor  that  is  currently  executing  the  thread. 
The net effect is that all memory allocated by an apply thread will be local, assuming that
the allocations can fit in its node.


In what follows, the phrase "a node allocates memory" indicates that one of the
threads running on the node (say, the even numbered thread) allocates the memory and
then the other thread on the node aliases the allocation. This allows certain read-only data
structures to be replicated across nodes but shared by the threads running on the node.

After  the  memory  model  is  set  up  using  dplace,  a  set  of  filters  for  moving  between 
the  different  levels  in  the  tree  are  allocated  on  each  node.  The  filters  at  the  lower  several 
levels  of  an  FMM  tree  are  based  on  moderate  sized  matrices.  Without  local  filters,  Uptree 
and  Downtree  do  not  scale  properly  because  there  is  a  bottleneck  when  all  processors  try 
to fetch the matrices out of a single node. Similarly, the shift operators $\lambda$ are replicated
in each node since there are at most eight per level (except for the $l = 0$ shift operators,
which are computed as needed).

Each thread allocates the field variables $s$ and $g$ for its groups as well as local thread
temporary storage (and working storage for the FFT routines used by the filters). In
addition, every thread allocates and computes its share of translation operators ($\mu$) that
are used by all threads. Replication of the translation operators is infeasible due to their
size. This has implications which are treated in Section 4.3.

The  end  result  is  that  each  node  contains  filters  (and  interpolators),  shift  operators, 
group  field  variables  s  and  g,  thread  local  storage,  and  a  share  of  the  translation  operators. 
All  of  the  other  data  structures  required  for  the  FMM,  and  there  are  many,  are  allocated 
without  concern  for  placement  because  they  are  not  performance  critical. 

4.3  Application  of  Translation  Operators 

Applying  the  translation  operators  in  a  scalable  way  is  more  problematic.  Here  the  fields 
of  all  far  away  groups  from  a  particular  group  are  translated,  converted  to  a  j-expansion 
valid  inside  the  group,  and  summed.  It  is  likely  that  the  field  of  a  far  away  group  will  be 
in  a  remote  node  which  makes  this  step  highly  cache  sensitive.  If  naively  implemented, 
the  application  of  translation  operators  scales  very  poorly.  Developing  a  method  so  that 
remote  fields  (fields  of  far  away  groups  that  are  stored  in  remote  nodes)  are  brought  into 
the  local  cache  and  reused  several  times  is  essential  to  the  overall  scaling  of  the  algorithm. 

A  simple  observation  is  the  key  to  scalability.  Consider  several  groups  that  are 
neighbors,  i.e.  close  together  in  space.  If  one  of  these  groups  needs  a  particular  remote 
field,  it  is  likely  that  its  neighbors  will  also  need  the  field  since  the  distances  between  the 
neighbors  and  the  remote  group  are  roughly  the  same.  The  essential  idea  is  to  translate 
the  remote  field  to  all  of  the  neighbors  in  succession  which  brings  the  field  into  the  cache 
and  reuses  it  many  times. 

To implement this idea, we need an ordering (numbering) of the groups for each level
in the tree that keeps groups that are close together in space also close together in the
ordering. Such an ordering is given by a breadth-first traversal of the group tree. A
breadth-first traversal at a level is defined as follows. For the top-most level $H - 1$ the
traversal is just to visit the single top-most group. To traverse level $l < H - 1$, visit all of
the groups which are at level $l + 1$ in breadth-first order and for each level $l + 1$ (parent)
group visit each of its subgroups. Since the subgroups are contained within the region of
the parent, we get an ordering that keeps groups close together in space. This ordering is
analogous to the Morton order reported in [10].
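
The recursive definition of the ordering can be sketched as follows. The dictionary-based tree and the function name `level_orderings` are assumptions made for this illustration; the sketch only shows that ordering each level by its parents' order keeps siblings adjacent.

```python
def level_orderings(children, root):
    """Return one ordering per level, top-most level first.

    children -- dict mapping a group to the list of its subgroups
                (leaf groups may be absent from the dict)
    """
    orderings = [[root]]              # level H - 1: the single top group
    while True:
        # Visit the parents in their own order and, for each parent,
        # visit its subgroups.
        next_level = [c for g in orderings[-1] for c in children.get(g, [])]
        if not next_level:
            return orderings
        orderings.append(next_level)

# A small hypothetical tree of height 3.
tree = {
    "A": ["B", "C"],
    "B": ["b0", "b1"],
    "C": ["c0", "c1"],
}
orders = level_orderings(tree, "A")
print(orders)
```

Threads that take contiguous slices of such an ordering work on groups that are spatial neighbors, which is what makes reuse of a remote field likely.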

One final issue has to do with the small size of the cache. The basic loop for applying
translation operators applies all operators to a group $m$ before moving on to the next group
in the ordering. It must be done this way in order to keep $g_m$ (the far field representation
of the j-expansion for the group) in the cache as well. Caches are too small, however, to
keep all of the remote fields at once, defeating the purpose of the ordering. The solution is
to translate only a piece of the far field representation of a far away group at a time. The
specific size of the pieces depends primarily on the cache size, but limiting the piece size
kps to about kps = 80 double precision complex numbers has worked well in practice on
several machines. So, at a given level, the ordering is traversed translating a piece of the
far field representation for each group. At the end of the ordering, the process moves on
to the next piece of the representation. This is repeated until all the far fields have been
translated at that level. The code then continues on to the next level. The algorithm is very
cache friendly.

Table 1
Scalability of threaded multilevel FMM.

    Processors   Time (s)   Speedup   Efficiency (%)
             1      607.9       1            100
             2      298.4       2.0          100
             4      152.3       4.0          100
             8       79.6       7.6           96
            16       42.6      14.3           89
            32       23.6      25.9           81


In detail, translate is implemented as follows:

    for (l ∈ 0 ... H − 1)
        for (kk = 0; kk < K_l; kk = kk + kps)
            ksize = min(kps, K_l − kk)
            for (m ∈ first(C_l, p) ... last(C_l, p))
                for (m' ∈ far(m))
                    for (k ∈ kk ... kk + ksize − 1)
                        g_mk = g_mk + T_mm'k s_m'k

where $T_{mm'k} = \mu(\mathbf{k}, \mathbf{X}_m - \mathbf{X}_{m'})$. These are the quantities that are precomputed in the setup
phase. The effectiveness of the new implementation is demonstrated in the next section.
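
The sliced loop computes exactly the same sums as the basic loop; only the order in which the far field directions are visited changes. The following sketch checks this on small made-up data. The function names, the dictionary `T` of precomputed operators, and the test values are all hypothetical, and plain Python lists stand in for the cache-resident arrays.

```python
def translate_naive(far, T, s, K):
    """Apply all translation operators to each group in turn."""
    g = [[0j] * K for _ in far]
    for m, far_m in enumerate(far):
        for mp in far_m:
            for k in range(K):
                g[m][k] += T[(m, mp)][k] * s[mp][k]
    return g

def translate_sliced(far, T, s, K, kps):
    """Same result, but only kps directions are handled per sweep."""
    g = [[0j] * K for _ in far]
    for kk in range(0, K, kps):
        ksize = min(kps, K - kk)          # the last piece may be shorter
        for m, far_m in enumerate(far):   # sweep over the group ordering
            for mp in far_m:
                for k in range(kk, kk + ksize):
                    g[m][k] += T[(m, mp)][k] * s[mp][k]
    return g

# Three groups, ten directions, pieces of four directions.
K, kps = 10, 4
far = [[1, 2], [0, 2], [0, 1]]
s = [[complex(m + 1, k) for k in range(K)] for m in range(3)]
T = {(m, mp): [complex(k, m - mp) for k in range(K)]
     for m in range(3) for mp in far[m]}

g_naive = translate_naive(far, T, s, K)
g_sliced = translate_sliced(far, T, s, K, kps)
print(g_naive == g_sliced)
```

In the sliced version, the piece `s[mp][kk:kk+ksize]` of a remote field is touched by every neighboring group within one sweep, so it is likely to remain in cache between uses.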

5  Results 

The scaling of the threaded multilevel FMM apply algorithm is given in Table 1. Listed is
the apply time in seconds versus the number of processors for a $16\lambda$ radius sphere discretized
by 153,600 unknowns. Also listed is the speedup $S_p = T_1/T_p$, where $T_p$ is the apply time
for $p$ processors, and the parallel efficiency $100\,S_p/p$. The scaling is very good, with 32
processors achieving 81% efficiency.

The  effect  of  the  technique  used  to  apply  the  translation  operators  is  shown  in  Table  2 
for  the  same  problem.  The  table  shows  the  total  time  spent  by  all  processors  in  the  Translate 
step. Using the technique described in Section 4.3, the effort to apply the operators grows
by  29.3%  as  the  number  of  processors  increases  from  1  to  32  (the  elapsed  time  is  82.5s  for 
1  processor  and  3.33s  for  32  processors).  In  contrast,  if  the  operators  are  applied  naively 
without  ordering  the  groups  or  dividing  up  the  far  field  directions  for  cache  efficiency,  the 
effort  to  apply  the  operators  grows  173%  and  begins  to  take  a  substantial  fraction  of  the 
total  apply  time. 
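
The percentages quoted above follow directly from the Table 2 entries, as the short calculation below shows. The helper names are hypothetical, and the elapsed time is taken to be the total time divided evenly among the processors, which is an assumption of perfect balance within the Translate step.

```python
# Total Translate time summed over all processors, in seconds (Table 2).
scalable = {1: 82.5, 2: 81.7, 4: 83.7, 8: 88.2, 16: 94.5, 32: 106.7}
unscalable = {1: 99.1, 2: 110.4, 4: 124.5, 8: 158.8, 16: 204.9, 32: 271.0}

def growth_percent(times, p=32):
    """Growth in total effort from 1 to p processors, in percent."""
    return 100.0 * (times[p] / times[1] - 1.0)

def elapsed_seconds(times, p):
    """Wall-clock time if the total effort is shared evenly by p processors."""
    return times[p] / p

print(round(growth_percent(scalable), 1))      # scalable implementation
print(round(growth_percent(unscalable)))       # naive implementation
print(round(elapsed_seconds(scalable, 32), 2)) # elapsed time on 32 processors
```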

The scaling of the apply can be further improved by additional tuning in Uptree and
Downtree. The main problem is that static data for the filters is not replicated across the
nodes, which causes a bottleneck (filter dynamic data, like the matrices, are replicated).
This can be improved with some programming effort.

Table 2
Time spent doing translations versus number of processors for scalable and unscalable
implementations.

    Processors          1       2       4       8      16      32
    Scalable (s)     82.5    81.7    83.7    88.2    94.5   106.7
    Unscalable (s)   99.1   110.4   124.5   158.8   204.9   271.0

6  Concluding  Remarks 

The  threaded  approach  taken  here  has  some  advantages  over  explicit  message  passing. 
Often  some  of  the  interprocessor  communications  required  in  complex  parallel  codes  are  not 
performance sensitive. Such communications can be handled automatically by the hardware
in  a  threaded  shared  memory  approach  without  burdening  the  programmer.  Making  the 
performance  sensitive  parts  work  properly,  i.e.  scale,  is  largely  an  exercise  in  tuning  the 
caches  which  must  be  done  regardless  for  good  uniprocessor  performance. 

In addition, there is a maintainability benefit. As fast scattering codes get more
complicated,  with  the  addition  of  support  for  complex  materials  and  subwavelength 
structures,  the  load  balancing  problem  implicit  in  message  passing  codes  will  become  very 
complex.  Parallelizing  such  codes  will  be  easier  in  a  shared  memory  environment. 

Significantly,  the  compact  size  of  the  FMM  allows  the  exploitation  of  another  form  of 
parallelism: computing the scattering from multiple incident angles. With large $O(N^2)$
operators  the  entire  machine  would  be  needed  just  to  store  the  operator.  The  FMM  is 
far  more  compact  and  can  be  replicated  several  times  in  a  supercomputer,  making  the 
multiple  angle  problem  embarrassingly  parallel.  The  same  is  true  for  design  optimization 
(parameter)  studies. 

The parallel FMM presented here is part of the FastScat program for performing
electromagnetic scattering calculations. Recently, FastScat computed the radar cross
section for both polarizations of a $40\lambda$ radius sphere to 0.16 dB rms accuracy in 20.7 hours
on a 32 node Origin 2000. The target was over 20,000 square wavelengths. The ability to
accurately compute the RCS of such a large target is due to the FMM, a discretization of
the integral equation that is of high order [1], and a scalable parallel implementation of the
FMM.
Abstract

We describe the FastScat™ program for electromagnetic scattering calculations and its parallel
implementation on the SGI Origin 2000. FastScat recently computed the radar cross section of a sphere
having an area of $45{,}239\lambda^2$ to high accuracy in about a day. This is contrasted with a result for a $354\lambda^2$
sphere reported at Supercomputing '92. Taking both size and accuracy into account, the FastScat result
represents an improvement in solution time of over nine orders of magnitude. This improvement was
due to systematically focusing on several issues that impact the scalability of electromagnetic scattering
calculations.


1  Introduction 

This  paper  presents  the  FastScat™  program  for  efficiently  performing  frequency  domain  electromagnetic 
scattering  calculations  using  a  boundary  integral  equation  formulation  on  parallel  computers.  Typical 
applications  include  radar  cross  section  (RCS)  prediction,  the  computation  of  antenna  radiation  patterns, 
and  high-frequency  circuit  package  modeling.  FastScat  is  a  truly  scalable  code  in  that: 

•  additional  accuracy  in  a  computed  solution  can  be  achieved  at  low  cost; 

•  a  small  increase  in  problem  size  (area)  causes  only  a  modest  increase  in  computer  resources;  and 

•  the  code  shows  good  parallel  scalability. 

The scalability of FastScat allows us to perform scattering calculations for very large objects. As an example,
FastScat recently computed the RCS of a metal sphere having an area of $45{,}239\lambda^2$ (radius $r = 60\lambda$) to high
accuracy in about a day. This is in contrast to the result for a $354\lambda^2$ sphere computed by the Patch code
running on the Intel Touchstone Delta reported at Supercomputing '92[3]. Taking both size and accuracy into
account, the FastScat result represents an improvement in solution time of over nine orders of magnitude.

Scattering cross sections and radiation patterns can be computed by solving a matrix equation, $Z \cdot I = V$,
derived from the discretization of an integral equation. The number of unknowns $N$ required for accurate
modeling of such problems can be very large, which can severely limit problem size. The system can be
solved by factoring the dense matrix $Z$ (using $O(N^3)$ operations), or by using an iterative method which
requires $O(N^2)$ operations per iteration. Each iteration of an iterative solver involves the multiplication of
an approximate solution $I$ for the source distribution by the impedance matrix $Z$. The iteration count must
be controlled to achieve reasonable solution times.

There are four important solution method characteristics required to achieve scalability:

• The method must be high order. It is desirable that computed solutions converge as $h$, the characteristic
scale size of the discretization, decreases. For boundary integral solutions to scattering problems, the
error $\epsilon$ generally scales as $\epsilon \propto h^p$, where $p > 1$ is the order of convergence. Most codes based on
the Method of Moments, such as Patch, are low order: $\epsilon \propto h^2$. In contrast, FastScat is high order
and values of $p$ up to 10 are routinely used. High order convergence allows us to get extra accuracy
for minimal additional computational cost. This is essential for estimating the solution error and
computing scattering from objects with large dynamic ranges[1].

• The method must be fast. An $O(N^3)$ method is feasible only for small problems. By using an
iterative solver and switching to the Fast Multipole Method (FMM)[2, 4, 6, 7, 9], the time complexity
can be reduced to $O(C_I N \log N)$, where $C_I$ is the iteration count. The FMM constructs a sparse
representation of $Z$ which is used to efficiently compute the product $Z \cdot I$.

• The integral equation must be well conditioned. FastScat uses a Combined Field Integral Equation
formulation (CFIE)[9] which results in a well conditioned operator for many scatterers. The CFIE, in
conjunction with a simple preconditioner and a conjugate gradient solver, keeps the iteration count $C_I$
reasonable.

•  The  implementation  must  have  good  parallel  scalability.  The  crucial  parallel  operation  in  FastScat 
is  applying  the  FMM.  All  other  computations  are  either  embarrassingly  parallel  or  are  so  cheap  that 
they  can  be  done  on  a  single  processor.  A  substantial  amount  of  work  has  been  done  on  parallelizing 
the  FMM  for  the  Laplace  equation[5,  8,  13].  For  electromagnetic  scattering,  the  Helmholtz  FMM 
is  required.  Achieving  good  parallel  scalability  with  this  variant  of  the  FMM  poses  some  additional 
challenges[10,  11]. 

Some parts of this work have been previously reported[1, 6, 9, 11]. Here we show how all of the parts fit
together  to  enable  the  solution  of  very  large  scattering  problems.  In  total,  we  believe  this  work  serves  as  the 
current  benchmark  for  the  state  of  the  art  in  frequency  domain  electromagnetic  scattering  calculations. 

This  paper  is  organized  into  a  section  on  each  aspect  of  scalability,  followed  by  a  results  section  and 
some  concluding  remarks. 

2 Discretizing the Integral Equation

Here we consider a prototypical scattering problem (3-D scalar scattering with Dirichlet boundary conditions)
to show how the linear system $V = Z \cdot I$ is formed. This will set the stage for the following sections on
discretizations and the FMM.

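A specified field $\phi(\mathbf{x})$ on a surface $S$ induces an unknown source distribution $\sigma(\mathbf{x}')$ on $S$. This
distribution radiates a scattered field

$$\psi(\mathbf{x}) = \int G(\mathbf{x} - \mathbf{x}')\,\sigma(\mathbf{x}')\,d\mathbf{x}' \qquad (1)$$

where the Green function is

$$G(r) = \frac{e^{i k_0 r}}{4\pi r} \qquad (2)$$

$k_0$ is the wave number ($k_0 = 2\pi$ in free space for dimensions in wavelengths), and $r = |\mathbf{x} - \mathbf{x}'|$. Applying
the Dirichlet boundary condition $\phi(\mathbf{x}) + \psi(\mathbf{x}) = 0$ for $\mathbf{x}$ on $S$ gives

$$\phi(\mathbf{x}) = -\int G(\mathbf{x} - \mathbf{x}')\,\sigma(\mathbf{x}')\,d\mathbf{x}', \quad \mathbf{x} \text{ on } S. \qquad (3)$$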

For a moment, ignore the singular nature of $G$. This integral can be evaluated numerically by choosing a
suitable $N$-point quadrature rule. Evaluating Equation 3 at the $i$th abscissa of the quadrature rule gives

$$V_i = -\sum_{j=1}^{N} w_j G_{ij} I_j \qquad (4)$$

where $V_i = \phi(\mathbf{x}_i)$, $G_{ij} = G(\mathbf{x}_i - \mathbf{x}_j)$, and $w_j$ is the weight of the $j$th sample point (at $\mathbf{x}_j$) of the quadrature
rule. We want to solve this linear system for the unknown sources $I$. From $I$, we can easily compute the
scattered field at any place exterior to $S$.
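
As an illustration of Equation 4, the sketch below applies the discretized operator directly, at $O(N^2)$ cost. It makes several assumptions that are not from the text: the kernel is normalized as $e^{i k_0 r}/(4\pi r)$, the singular self term $i = j$ is simply skipped rather than corrected, and the function names and sample data are invented for the example.

```python
import cmath
import math

K0 = 2 * math.pi  # wave number for lengths measured in wavelengths

def green(r):
    """Helmholtz kernel; the 1/(4*pi) normalization is an assumption."""
    return cmath.exp(1j * K0 * r) / (4 * math.pi * r)

def apply_direct(points, weights, current):
    """Evaluate V_i = -sum_j w_j G_ij I_j by brute force.

    The i == j term is singular and is omitted in this sketch.
    """
    n = len(points)
    v = [0j] * n
    for i in range(n):
        for j in range(n):
            if i != j:
                r = math.dist(points[i], points[j])
                v[i] -= weights[j] * green(r) * current[j]
    return v

# Two sample points exactly one wavelength apart.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
weights = [0.5, 0.25]
current = [1.0, 2.0]
v = apply_direct(points, weights, current)
print(v)
```

The double loop is the product that an iterative solver would need at every iteration, which is the cost the FMM is designed to avoid.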

Equations equivalent to Equation 3 are also available for electromagnetic scattering. FastScat uses the
Combined Field Integral Equation formulation which is well conditioned and immune to spurious internal
resonances[9]. Using the CFIE in conjunction with a simple preconditioner¹ and a conjugate gradient type
solver keeps iteration counts reasonable. For the $r = 60\lambda$ sphere, only 19 iterations were required for roughly
two digits of accuracy.

3  High  Order  Discretizations 

The quadrature rule used in Equation 4 is selected so that it integrates a certain class $\mathcal{F}$ of functions over $S$
exactly. If the source distribution can be represented exactly as an expansion over $\mathcal{F}$ then the convolution
can be computed exactly.

In practice, the source distribution on an arbitrarily-shaped surface can be well approximated by dividing
it into patches and locating the sample points on each patch according to a quadrature rule that can integrate
polynomials exactly up to order $p$. In the case of quadrilaterals, an appropriate rule is formed from the
product of two Gauss-Legendre rules. Analogous rules exist for triangles[12]. The overall discretization will
converge with $O(h^p)$ assuming expansions over $\mathcal{F}$ are accurate to that order.

This works extremely well for regular kernels, but Nature is not so kind and the Helmholtz kernel
$G$ behaves poorly as the points $i$ and $j$ become close. When this happens, the quadrature rule needs to
be adjusted to account for the singular and oscillatory nature of $G$. The proper adjustment is achieved by
replacing the discretized Green function in Equation 4 by

$$G_{ij} = \begin{cases} G(\mathbf{x}_i - \mathbf{x}_j) & \text{if } \mathbf{x}_i \text{ is far from } \mathbf{x}_j \\ L_{ij} & \text{otherwise} \end{cases} \qquad (5)$$

where the $L_{ij}$ are known as the "local corrections"[1]. The definition of "far from" depends on the desired
accuracy. In practice it is about a half wavelength for two digits.

For a given field point $i$, the $L_{ij}$ are computed by solving the linear system

$$\sum_j w_j L_{ij} f^{(k)}(\mathbf{x}_i - \mathbf{x}_j) = \int_{D_i} G(\mathbf{x}_i - \mathbf{x}')\, f^{(k)}(\mathbf{x}_i - \mathbf{x}')\, d\mathbf{x}' \qquad (6)$$

for all the testing functions in $\mathcal{F}$. The region $D_i$ is the local domain of the $i$th field point. This region
is determined by computing the right hand side of Equation 6 adaptively, on a patch by patch basis, and
comparing it to the left hand side quadrature. This procedure continues until the difference is below some
error tolerance. The local corrections for points outside of $D_i$ are zero so the linear system is small. The
number of points in $D_i$ may be different from the number of testing functions in $\mathcal{F}$, in which case singular
value decomposition is used to solve the system. However, it is often possible to arrange the system so that
the number of points and functions are the same. This approach restores the desired order of convergence,
which has been shown on many scatterers.

In  terms  of  scalable  scattering  calculations,  high  order  discretizations  allow  us  to  check  the  accuracy  of 
solutions  relatively  cheaply.  They  also  allow  us  to  often  compute  a  solution  to  a  given  accuracy  with  fewer 
unknowns. 

4  Fast  Multipole  Method 

The FMM computes the sum $\sum_j w_j G_{ij} I_j$ (Equation 4) for all points $i$ in $O(N \log^2 N)$ time. This is the
product $Z \cdot I$ needed by the iterative solver. This section presents the basics of the Helmholtz FMM.

¹ The preconditioner is block diagonal and represents the inverse of some FMM group self-interactions. It works well
for many scatterers, but does not remove all of the ill-conditioning in the formulation. Generalizations to the CFIE are
currently being explored and some look very promising.

Consider two well-separated spheres of radius $R_1$ and $R_2$, each containing a collection of Helmholtz
sources. We want to quickly evaluate the field generated by all the sources in $R_1$ at every source in $R_2$. This
field can be written as a multipole expansion valid outside of $R_1$ as

$$\psi(\mathbf{r}) = \sum_{lm} \beta_{lm}\, h_l(kr)\, Y_{lm}(\theta, \phi) \qquad (7)$$

where $r$, $\theta$, and $\phi$ are relative to a coordinate system centered in $R_1$, $h_l(kr)$ are spherical Hankel functions
of the first kind, and $Y_{lm}(\theta, \phi)$ are normalized spherical harmonics. We refer to this expansion as an
h-expansion. Similarly, we can write an expression for the field valid inside $R_2$:

$$\psi(\mathbf{r}) = \sum_{lm} \alpha_{lm}\, j_l(kr)\, Y_{lm}(\theta, \phi) \qquad (8)$$

where the coordinate system is now centered in $R_2$, and $j_l(kr)$ are spherical Bessel functions. We refer to
this expansion as a j-expansion. For the moment, we consider both of these to be infinite sums. The FMM
then rests on three observations:

• The origin of an h-expansion can be shifted arbitrarily inside $R_1$, and a new set of coefficients, $\beta'_{lm}$,
can be computed for this new expansion. The same holds for shifting a j-expansion arbitrarily to a new
origin inside of $R_2$, which results in a new set of coefficients, $\alpha'_{lm}$.

• An h-expansion valid outside of $R_1$ can be translated and converted into a j-expansion valid inside $R_2$.

• Most crucial, these shifts and translations can be done efficiently by transforming the coefficients into
a basis in which both operators are diagonal.

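The far-field transform of an arbitrary function $f(\mathbf{k})$ is

$$f(\mathbf{k}) = \sum_{lm} i^l\, Y_{lm}(\mathbf{k})\, f_{lm} \qquad (9)$$

and the inverse transform is given by

$$f_{lm} = \int d\mathbf{k}\; i^{-l}\, Y^*_{lm}(\mathbf{k})\, f(\mathbf{k}) \qquad (10)$$

where $\mathbf{k}$ is a unit vector represented by polar and azimuthal angular components $(k_\theta, k_\phi)$.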

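It is in this k-basis that the shift and translation operators are diagonal. An h-expansion in its far-field
basis is shifted from a point $\mathbf{x}$ to another point $\mathbf{x}'$, both inside of $R_1$, by

$$\beta'(\mathbf{k}) = \lambda(\mathbf{k}, \mathbf{x}' - \mathbf{x})\,\beta(\mathbf{k}) \qquad (11)$$

where $\lambda$ is given by

$$\lambda(\mathbf{k}, \mathbf{x}' - \mathbf{x}) = e^{i k_0 \mathbf{k} \cdot (\mathbf{x}' - \mathbf{x})} \qquad (12)$$

The same shift operator $\lambda$ also applies to j-expansions. It represents a "local" shift in the group center,
retaining the exterior or interior expansion.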
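
That the shift is diagonal in the far-field basis can be checked numerically for a few point sources: the far-field pattern about a new center equals the pattern about the old center multiplied, direction by direction, by a single phase factor. In the sketch below the sign convention $e^{-i k_0 \mathbf{k} \cdot (\mathbf{x}_a - \mathbf{X})}$ for the far field of a source at $\mathbf{x}_a$ about a center $\mathbf{X}$ is an assumption, as are the function names and the sample data.

```python
import cmath
import math

K0 = 2 * math.pi  # wave number for lengths measured in wavelengths

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def far_field(sources, center, khat):
    """Far-field pattern in unit direction khat, referred to `center`.

    sources -- list of (position, amplitude) pairs
    """
    total = 0j
    for position, amplitude in sources:
        offset = tuple(p - c for p, c in zip(position, center))
        total += amplitude * cmath.exp(-1j * K0 * dot(khat, offset))
    return total

def shift_operator(khat, delta):
    """Diagonal shift: a pure phase for each direction khat."""
    return cmath.exp(1j * K0 * dot(khat, delta))

sources = [((0.1, -0.2, 0.05), 1.0 + 0.5j), ((-0.15, 0.1, 0.2), -0.7j)]
old_center = (0.0, 0.0, 0.0)
new_center = (0.25, 0.25, 0.25)
khat = (0.6, 0.0, 0.8)  # a unit vector

delta = tuple(n - o for n, o in zip(new_center, old_center))
direct = far_field(sources, new_center, khat)
shifted = shift_operator(khat, delta) * far_field(sources, old_center, khat)
print(abs(direct - shifted))
```

Each direction is handled independently, so shifting a far field sampled at $K$ directions costs $K$ complex multiplications.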

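The translation of an h-expansion into a j-expansion is through the translation operator $\mu$, which, in the
far-field basis, is

$$\mu(\mathbf{k}, \mathbf{x}' - \mathbf{x}) = \sum_l i^l (2l + 1)\, h_l(k_0 |\mathbf{x}' - \mathbf{x}|)\, P_l\!\left(\mathbf{k} \cdot (\mathbf{x}' - \mathbf{x}) / |\mathbf{x}' - \mathbf{x}|\right) \qquad (13)$$

where the $P_l$ are Legendre polynomials.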
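
The role of the translation operator can be checked with the standard diagonal-form identity for the kernel $e^{i k_0 r}/r$: integrating a plane wave times the truncated operator over all directions reproduces the kernel between two well-separated points. The sketch below is an independent numerical check written for illustration. The prefactor $i k_0/(4\pi)$, the kernel normalization, the direction convention for the separation vector `X`, the truncation `L=10`, and all function names are assumptions rather than details taken from the text.

```python
import cmath
import math

K0 = 2 * math.pi  # wave number for lengths measured in wavelengths

def gauss_legendre(n):
    """Nodes and weights of the n-point Gauss-Legendre rule on [-1, 1]."""
    nodes, weights = [], []
    for i in range(1, n + 1):
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))  # initial guess
        for _ in range(100):  # Newton iteration on P_n(x)
            p0, p1 = 1.0, x
            for l in range(2, n + 1):
                p0, p1 = p1, ((2 * l - 1) * x * p1 - (l - 1) * p0) / l
            dp = n * (x * p1 - p0) / (x * x - 1.0)
            dx = p1 / dp
            x -= dx
            if abs(dx) < 1e-15:
                break
        nodes.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights

def spherical_hankel(lmax, x):
    """h_l^(1)(x), l = 0..lmax, by upward recurrence (fine while lmax < x)."""
    h = [cmath.exp(1j * x) / (1j * x)]
    h.append(h[0] * (1.0 / x - 1j))
    for l in range(1, lmax):
        h.append((2 * l + 1) / x * h[l] - h[l - 1])
    return h

def mu(khat, X, L):
    """Truncated sum_l i^l (2l+1) h_l(k0|X|) P_l(khat . X/|X|)."""
    dist = math.sqrt(sum(c * c for c in X))
    c = sum(kc * xc for kc, xc in zip(khat, X)) / dist
    h = spherical_hankel(L, K0 * dist)
    p = [1.0, c]
    for l in range(2, L + 1):
        p.append(((2 * l - 1) * c * p[-1] - (l - 1) * p[-2]) / l)
    return sum((1j ** l) * (2 * l + 1) * h[l] * p[l] for l in range(L + 1))

def translated_kernel(X, d, L, n=16):
    """(i k0 / 4 pi) * integral over directions of exp(i k0 k.d) mu(k, X)."""
    xs, ws = gauss_legendre(n)
    total = 0j
    for x, w in zip(xs, ws):
        s = math.sqrt(1.0 - x * x)
        for j in range(2 * n):  # trapezoidal rule in the azimuthal angle
            phi = math.pi * j / n
            khat = (s * math.cos(phi), s * math.sin(phi), x)
            phase = K0 * sum(kc * dc for kc, dc in zip(khat, d))
            total += w * (math.pi / n) * cmath.exp(1j * phase) * mu(khat, X, L)
    return 1j * K0 / (4 * math.pi) * total

X = (3.0, 1.0, 0.5)    # separation of the two group centers
d = (0.2, -0.1, 0.15)  # field-point offset minus source offset
r = math.sqrt(sum((a + b) ** 2 for a, b in zip(X, d)))
exact = cmath.exp(1j * K0 * r) / r
approx = translated_kernel(X, d, L=10)
print(abs(approx - exact) / abs(exact))
```

The operator depends only on the separation of the group centers, so one precomputed table of `mu` values per pair of centers serves every source and field point in the two groups.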

In practice the expansions are truncated to a finite number of terms $L$ depending on the group size and
desired accuracy. The mathematical validity of this truncation is addressed by Rokhlin[7], but it is related to
the fact that these series are asymptotic and are, therefore, of controllable accuracy. Empirically, it has been
determined that the number of terms $L$ needed in the expansions for a region of diameter $D$ is[2]

$$L = k_0 D + \frac{d}{1.6} \log(k_0 D + \pi) \qquad (14)$$

where $d$ is the desired number of digits.
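
A small helper makes the growth of $L$ with group size and accuracy concrete. The sketch assumes that the rule has the form $L = k_0 D + (d/1.6)\ln(k_0 D + \pi)$ with a natural logarithm, and that the result is rounded up to an integer; the constant, the logarithm base, the rounding, and the function name are all assumptions of this example.

```python
import math

def expansion_terms(k0_D, digits):
    """Number of terms for a region of electrical size k0*D.

    Assumes L = k0*D + (digits / 1.6) * ln(k0*D + pi), rounded up.
    """
    return math.ceil(k0_D + (digits / 1.6) * math.log(k0_D + math.pi))

# A group one wavelength in diameter (k0 * D = 2 * pi).
for digits in (2, 4, 6):
    print(digits, expansion_terms(2 * math.pi, digits))
```

For large groups the first term dominates, so $L$ grows roughly linearly with the group diameter while extra digits cost only a logarithmic correction.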

The above expressions for the translation operators, together with the far-field transform, are the basic
tools used to construct a multilevel FMM algorithm. Clearly the field caused by a collection of sources inside
an arbitrary group $G_1$ can be evaluated at any point inside a second group $G_2$ by converting the exterior
h-expansion, valid outside $G_1$, to an interior j-expansion which is valid inside $G_2$. Also, we can calculate the
field at that point caused by the sources in $G_1$ by computing $\alpha_{00}$, the leading term in the j-expansion. No
other terms contribute, because the expansion is already centered at the field point where $r = 0$ and all the
terms $j_l(0)$ are zero except for $j_0$, which is one. Thus, we can evaluate the field directly through the far-field
transform as

$$\psi(0) = \alpha_{00} = \frac{1}{\sqrt{4\pi}} \int d\mathbf{k}\; \alpha(\mathbf{k}). \qquad (15)$$

The abscissae $\mathbf{k} = (k_\theta, k_\phi)$ of the numerical quadrature rule used to compute this integral are selected so that
it can be performed exactly. One choice is to use a trapezoidal rule of $2L$ points in the $\phi$ direction and an $L$
point Gauss-Legendre rule in the $\theta$ direction. This discretization of the k basis is used throughout the FMM.
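
The product rule described above can be sketched in pure Python. The Gauss-Legendre nodes are found by Newton iteration, and the rule is checked on a few low-degree polynomials on the unit sphere. The function names and the choice $L = 4$ are assumptions made for the illustration.

```python
import math

def gauss_legendre(n):
    """Nodes and weights of the n-point Gauss-Legendre rule on [-1, 1]."""
    nodes, weights = [], []
    for i in range(1, n + 1):
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))  # initial guess
        for _ in range(100):  # Newton iteration on P_n(x)
            p0, p1 = 1.0, x
            for l in range(2, n + 1):
                p0, p1 = p1, ((2 * l - 1) * x * p1 - (l - 1) * p0) / l
            dp = n * (x * p1 - p0) / (x * x - 1.0)
            dx = p1 / dp
            x -= dx
            if abs(dx) < 1e-15:
                break
        nodes.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights

def sphere_rule(L):
    """(theta, phi, weight) triples: L Gauss-Legendre points in cos(theta)
    times 2L equally spaced (trapezoidal) points in phi."""
    xs, ws = gauss_legendre(L)
    rule = []
    for x, w in zip(xs, ws):
        theta = math.acos(x)
        for j in range(2 * L):
            rule.append((theta, math.pi * j / L, w * math.pi / L))
    return rule

def integrate(f, rule):
    return sum(w * f(theta, phi) for theta, phi, w in rule)

rule = sphere_rule(4)
area = integrate(lambda t, p: 1.0, rule)
z2 = integrate(lambda t, p: math.cos(t) ** 2, rule)
x2 = integrate(lambda t, p: (math.sin(t) * math.cos(p)) ** 2, rule)
print(len(rule), area, z2, x2)
```

The rule has $2L^2$ directions in total, which is the size of the discretized k basis at a level whose expansions keep $L$ terms.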

The  multilevel  Helmholtz  FMM  works  in  fundamentally  the  same  way  as  the  Laplace  FMM  in  that  it 
combines  expansions  valid  inside  the  original  groups  to  form  expansions  valid  inside  correspondingly  larger 
groups  with  bigger  group  diameters.  This  recursive  regrouping  results  in  a  tree-like  structure  that  has  groups 
of  different  sizes  at  different  levels  of  the  tree.  The  h-expansions  from  neighboring  groups  are  shifted  and 
combined  into  a  single  h-expansion  representing  a  larger  group  when  going  up  the  tree,  and  j-expansions  in 
a  large  group  are  converted  to  smaller  groups  going  down  the  tree.  The  details  of  this  process  are  given  in 
Section  5.1. 

There  is,  however,  an  important  mathematical  detail.  When  going  up  the  tree,  it  is  necessary  to 
interpolate  the  far-field  representation  of  a  group  at  one  level  onto  the  denser  (k  more  closely  spaced) 
basis  of  the  group  one  level  higher.  Similarly,  when  going  down  the  tree,  it  is  necessary  to  convert  to  a 
sparser  basis  in  a  filtering  process.  In  both  cases,  the  code  converts  from  the  far-field  basis  to  the  multipole 
coefficients  and  then  back  to  the  new  far-field  basis  using  the  definitions  given  in  Equations  9  and  10.  The 
actual implementation is in terms of fast Fourier transforms for the $k_\phi$ direction, and fast associated Legendre
transforms for the $k_\theta$ direction[14]. As a practical matter, a slow associated Legendre transform which is
implemented  in  terms  of  matrix  multiplication  can  be  used  on  rather  large  problems  because  of  the  small 
prefactor  in  its  time  complexity  relative  to  the  fast  transform.  However,  fetching  the  transform  matrices  from 
memory  causes  some  scalability  problems  which  are  addressed  in  Section  5.2.  The  details  of  the  filtering 
and  interpolation  processes  are  given  in  [6]. 

5  Parallel  Implementation 

FastScat  is  implemented  in  a  threaded  style  assuming  a  cache-coherent  distributed  shared  memory  machine. 
On the O2000, it uses IRIX threads (SPROCs)[11]. A POSIX threads version is also available. In order to
achieve  parallel  scalability,  it  is  essential  that  the  local  processor  caches  be  used  effectively  and  that  selected 
data  structures  are  replicated  to  reduce  network  contention. 

A  FastScat  run  progresses  through  three  phases:  setup,  solve,  and  RCS  computation.  The  setup  computes 
the  local  corrections  Lij ,  and  is  embarrassingly  parallel.  The  scalability  is  good  to  about  32  processors  and 
then  begins  to  fall  off  due  to  contention  over  the  discretization  data  structures.  The  RCS  computations  are 
also easy to parallelize. Perfect scalability in the setup and RCS phases is not presently a concern since, on
practical  problems,  FastScat  spends  most  of  its  time  solving  for  the  surface  currents  for  various  excitations 
(“look  angles”)2. 

The  solve  phase  uses  the  iterative  solver,  preconditioner,  and  FMM.  The  preconditioner  can  be 
applied  in  parallel  easily  (backsubstitution  of  the  blocks),  and  the  iterative  solver  does  inner  products  over 
relatively  short  vectors  (at  most  a  few  million  elements)  which  can  be  done  on  a  single  processor.  Naive 
implementations of the FMM, however, scale very poorly. On the O2000, there is hardly any benefit to using
more  than  a  few  processors.  The  remainder  of  this  section  describes  the  implementation  of  FastScat’s  parallel 
FMM. 


2 The sphere run spends more of its time in setup since there is only one look angle.

5.1  Parallel  FMM

There are two primary FMM routines: setup, which builds the data structures, and apply, which computes the
product Z · I.

The setup routine works as follows. First, a tree of groups is constructed. The lowest level (l = 0)
groups contain elementary sources. Each higher level group at some level l contains up to eight level l - 1
subgroups of one half the size in each linear dimension. However, since a surface is being discretized, the
typical number of subgroups is about four. The top of the tree consists of a single group which contains
the entire scatterer. The quantity H is the height of the tree in levels, and the topmost level is H - 1. Let
groups(l) be the set of groups at level l, and $M_l$ be the number of elements in this set. Denote the parent
of a group m by $m_p$. Finally, let $L_l$ be the number of terms in the expansion at level l as determined by
Equation 14.

For each group m two sets (lists) are constructed, nearby(m) and far(m), based on the following
conditions:

$$m' \in \mathrm{nearby}(m) \quad \text{iff} \quad k_0 |X_{mm'}| < L_l \qquad (16)$$

$$m' \in \mathrm{far}(m) \quad \text{iff} \quad m' \notin \mathrm{nearby}(m) \ \text{and} \ k_0 |X_{m_p m'_p}| < L_{l+1} \qquad (17)$$

where m and m' are members of groups(l), and $X_{mm'}$ is the vector between the group centers $X_m$ and
$X_{m'}$. In other words, a group is in the nearby list of m if it is too close to use the translation operators at that
level. Otherwise, it is in the far list as long as the parents of m and m' are too close to use their translation
operators. Interactions between sources are accounted for at the highest possible level.
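
The two conditions can be sketched for a single level as below. This is an illustrative brute-force version that tests every pair of groups; the function name, the dictionary-based data layout, and the argument names are assumptions for the example and do not describe FastScat's actual data structures.

```python
import math


def interaction_lists(centers, parent, parent_centers, k0, L_l, L_parent):
    """Build nearby(m) and far(m) for every group m at one level.

    centers        -- {group: (x, y, z)} centers of the groups at level l
    parent         -- {group: parent id}
    parent_centers -- {parent id: (x, y, z)} centers at level l + 1
    L_l, L_parent  -- expansion lengths at levels l and l + 1
    """
    nearby = {m: [] for m in centers}
    far = {m: [] for m in centers}
    for m, xm in centers.items():
        for mp, xmp in centers.items():
            if k0 * math.dist(xm, xmp) < L_l:
                # Too close to use the translation operator at this level.
                nearby[m].append(mp)
            elif k0 * math.dist(parent_centers[parent[m]],
                                parent_centers[parent[mp]]) < L_parent:
                # The parents are too close, so translate at this level.
                far[m].append(mp)
            # Otherwise the pair is handled at a higher level of the tree.
    return nearby, far


centers = {0: (0, 0, 0), 1: (4, 0, 0), 2: (8, 0, 0), 3: (12, 0, 0)}
parent = {0: "A", 1: "A", 2: "B", 3: "B"}
parent_centers = {"A": (2, 0, 0), "B": (10, 0, 0)}
nearby, far = interaction_lists(centers, parent, parent_centers,
                                k0=1.0, L_l=5, L_parent=9)
```

In this toy layout group 0 is near itself and group 1, and translates from groups 2 and 3 because the two parents are too close to interact one level up.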

Once the tree is constructed, various quantities, such as the translation operators, are computed. The
setup  routine  is  called  only  once. 

When the iterative solver needs to compute B = Z · I, it calls the apply routine. For most problems,
FastScat  spends  most  of  its  time  in  apply.  Apply  is  implemented  in  terms  of  P  threads  where  P  is  the  number 
of  processors.  The  apply  steps  are  written  in  terms  of  loops  over  groups  and  it  is  a  simple  matter  to  split  these 
loops  over  the  threads.  These  loops  are  controlled  by  a  thread-safe  counter  that  has  two  primary  routines: 
reset(C)  and  next(C,p),  where  C  is  a  counter  and  p  is  a  thread  number.  The  reset  routine  sets  the  counter 
to  zero  and  acts  as  a  barrier.  The  next  routine  returns  the  next  value  of  the  counter.  The  basic  usage  is  that  all 
the  threads  initialize  the  counter  to  zero  with  reset,  and  then  enter  a  loop  getting  the  next  value  of  the  counter 
until  all  the  groups  at  a  given  level  have  been  processed.  In  addition,  there  are  two  routines  first(C,p)  and 
last(C,p) which together define a sequence of groups first(C,p) ... last(C,p) that thread p can process
efficiently  because  the  data  structures  for  the  groups  have  been  allocated  locally  (see  Section  5.2). 
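
A counter of this kind might look like the sketch below, written with Python threads purely for illustration. The class name `GroupCounter`, the sentinel return value, and the rule of taking work from the thread with the most remaining groups are assumptions; the text only states that reset acts as a barrier and that next hands out a thread's own groups first (see Section 5.2).

```python
import threading


class GroupCounter:
    """Hands out group indices 0..n_groups-1 to n_threads worker threads."""

    def __init__(self, n_groups, n_threads):
        self.n_groups = n_groups
        self.n_threads = n_threads
        self._lock = threading.Lock()
        self._barrier = threading.Barrier(n_threads)
        # Contiguous block [first(p), last(p)] owned by each thread.
        self._first = [p * n_groups // n_threads for p in range(n_threads)]
        self._last = [(p + 1) * n_groups // n_threads - 1 for p in range(n_threads)]
        self._cursor = list(self._first)

    def first(self, p):
        return self._first[p]

    def last(self, p):
        return self._last[p]

    def reset(self, p):
        """Barrier: rewind every block, then return thread p's first group."""
        if self._barrier.wait() == 0:   # exactly one thread rewinds
            self._cursor = list(self._first)
        self._barrier.wait()            # nobody proceeds before the rewind
        return self.next(p)

    def next(self, p):
        """Next unprocessed group: own block first, then the busiest block."""
        with self._lock:
            q = p
            if self._cursor[p] > self._last[p]:
                q = max(range(self.n_threads),
                        key=lambda t: self._last[t] - self._cursor[t])
            if self._cursor[q] > self._last[q]:
                return self.n_groups    # sentinel: the level is finished
            m = self._cursor[q]
            self._cursor[q] += 1
            return m


def process_level(n_groups, n_threads):
    """Run one loop over the groups and record which thread took which group."""
    counter = GroupCounter(n_groups, n_threads)
    done = [[] for _ in range(n_threads)]

    def worker(p):
        m = counter.reset(p)
        while m < n_groups:
            done[p].append(m)
            m = counter.next(p)

    threads = [threading.Thread(target=worker, args=(p,)) for p in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return done
```

Each worker's loop has the same shape as a sequential loop over groups, which is the property the text relies on to keep the parallel and sequential codes similar.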

To compute B = Z · I, each thread p does the following:

Local-to-Far:  The  far-field  basis  of  each  l  =  0  group  is  constructed  from  its  sources.  There  is  no  need  to 
compute  the  multipole  coefficients  since  it  is  a  simple  matter  to  compute  the  far  field  directly  from  the 
sources. 

    for (reset(C_0); m < M_0; m = next(C_0, p))
        for (k ∈ 0 ... K_0 - 1)
            s_mk = Σ_{a ∈ sources(m)} exp(i k · (X_m - x_a)) I_ma

Note that at every level in the tree, there is a counter $C_l$ controlling the iterations at that level. The
number of far field directions at a level is $K_l = 2L_l^2$ using the quadrature rule described in Section 4.
It should be clear that each value of an index k represents some $\hat{k} = (k_\theta, k_\phi)$ in the discretized far field
basis for that level. The sources of an l = 0 group m are sources(m), and the location of a source a is
xa.  The  vector  s  is  simply  the  0(k)  quantities  of  Section  4. 

Uptree:  The  far  fields  due  to  each  subgroup  of  a  group  are  interpolated  and  shifted  to  the  group’s  center  and 
accumulated  to  form  the  far  field  basis  of  the  parent  group. 

    for (l ∈ 1 ... H - 1)
        for (reset(C_l); m < M_l; m = next(C_l, p))
            for (m' ∈ subgroups(m))
                s_m' = interpolate(s_m')
                for (k ∈ 0 ... K_l - 1)
                    s_mk = s_mk + A(k, X_m ...


Translate:  For  each  group  m,  the  far  field  of  each  far  away  group  is  translated  to  m,  converted  to  a  j- 
expansion,  and  accumulated.  This  gives  the  field  due  to  all  groups  far  from  m  as  a  j-expansion  valid 
inside  of  m . 

    for (l ∈ 0 ... H - 1)
        for (m ∈ first(C_l, p) ... last(C_l, p))
            for (m' ∈ far(m))
                for (k ∈ 0 ... K_l - 1)
                    g_mk = g_mk + μ(k, X_m - X_m') s_m'k


The  vector  g  contains  the  a(k)  quantities. 

Downtree:  The  j-expansions  are  walked  down  the  tree  in  a  way  analogous  to  Uptree.  The  code  works 
downward  from  level  H  —  1,  shifting  the  field  gm  of  group  m  to  its  subgroups  and  then  filtering 
(instead  of  interpolating).  The  parallel  structure  is  just  like  Uptree. 

Far-to-Local:  At  the  bottom  of  the  tree,  the  j-expansions  are  used  to  evaluate  the  field  at  each  source  due  to 
all far away sources. The procedure is the same as the Local-to-Far step except that k → -k. At the
end of this step, the result (B) has been computed for all far away interactions.

Direct:  To  account  for  interactions  between  groups  that  are  too  close  to  each  other  to  use  the  FMM,  the 
locally  corrected  kernel  (Equation  5)  is  used  directly: 

    for (reset(C_0); m < M_0; m = next(C_0, p))
        for (m' ∈ nearby(m))
            for (a ∈ sources(m))
                B_ma = B_ma + Σ_{a' ∈ sources(m')} G(x_a - x_a') I_m'a'

This  description  of  the  parallel  algorithm  is  very  similar  to  its  sequential  counterpart.  The  only  complications 
are  operations  on  the  counters,  which  look  like  regular  loops.  The  similarities  between  the  parallel  and 
sequential algorithms make implementation and maintenance easier.

This  algorithm  is  for  scalar  (acoustic  with  Dirichlet  boundary  conditions)  scattering.  For  the  vector  case 
(electromagnetic),  the  work  doubles  because  two  field  components  must  be  kept  for  each  source  but  the 
algorithm  is  otherwise  straightforward.  The  results  in  Section  6  are  for  electromagnetic  scattering. 

5.2  Data  Placement 

For  most  FMM  steps,  memory  references  tend  to  be  localized  to  the  data  associated  with  a  particular  group 
and its subgroups. In order to make these references efficient (accesses to local memory) each apply thread
p is assigned a sequence of groups first($C_l$,p) ... last($C_l$,p) at each level l. For example, if there are eight
groups  at  a  level  and  two  threads,  the  first  thread  gets  groups  1 ...  4  and  the  second  thread  gets  5 ...  8.  As 
part  of  its  initialization,  each  thread  allocates  certain  key  data  structures,  such  as  s  and  g  for  its  sequence  of 
groups.  These  allocations  will  generally  go  to  the  local  memory  since  a  first-touch  memory  allocation  policy 
is  used.  Threads  also  set  their  processor  affinities  so  that  they  are  not  moved  away  from  their  data  structures 
by the operating system. One additional point is that the counter's next($C_l$,p) routine first returns groups in
thread p's sequence. Once the sequence is exhausted, it returns groups in the sequences of threads that are
behind in the computation. This acts as a form of dynamic load balancing[11].

A  modest  amount  of  data  replication  is  also  required.  The  routines  interpolate  and  filter  used  by 
Uptree  and  Downtree  contain  several  moderately  sized  matrices  used  in  the  filtering  and  interpolation  process 
(Section  4).  These  must  be  replicated  a  few  times  to  reduce  network  contention  and  preserve  the  scalability 
of  Uptree  and  Downtree.  Presently,  FastScat  replicates  the  matrices  in  every  node  (two  processors),  but  this 
is probably overkill.

Processors    Time (s)    Speedup    Efficiency (%)
1             607.9       1          100
2             298.4       2.0        100
4             152.3       4.0        100
8             79.6        7.6        96
16            42.6        14.3       89
32            23.6        25.9       81

Table 1
Scalability of threaded multilevel FMM for a r = 16λ sphere.


5.3  Scalable  Application  of  Translation  Operators 

Applying  the  translation  operators  in  a  scalable  way  is  more  problematic.  Here  the  fields  of  all  far  away 
groups from a particular group are translated, converted to a j-expansion valid inside the group, and summed.
It  is  likely  that  the  field  of  a  far  away  group  will  be  in  a  remote  node  which  makes  this  step  highly  cache 
sensitive.  If  naively  implemented,  the  application  of  translation  operators  scales  very  poorly.  Developing  a 
method  so  that  remote  fields  (fields  of  far  away  groups  that  are  stored  in  remote  nodes)  are  brought  into  the 
local  cache  and  reused  several  times  is  essential  to  the  overall  scaling  of  the  algorithm. 

A  simple  observation  is  the  key  to  scalability.  Consider  several  groups  that  are  neighbors,  i.e.  close 
together  in  space.  If  one  of  these  groups  needs  a  particular  remote  field,  it  is  likely  that  its  neighbors  will 
also  need  the  field  since  the  distances  between  the  neighbors  and  the  remote  group  are  roughly  the  same. 
The  essential  idea  is  to  translate  the  remote  field  to  all  of  the  neighbors  in  succession  which  brings  the  field 
into the cache and reuses it many times. To do this, we need an ordering (numbering) of the groups for each
level  in  the  tree  that  keeps  groups  that  are  close  together  in  space  also  close  together  in  the  ordering.  Such  an 
ordering  is  given  by  a  breadth-first  traversal  of  the  group  tree.  This  is  analogous  to  the  Morton  order  reported 
in [13].
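
For comparison, the Morton (Z-order) key mentioned above can be computed by interleaving the bits of a group's integer grid coordinates, as in the sketch below. This illustrates the Morton order only, not the breadth-first traversal that the text describes; the function name and the 10-bit default are assumptions.

```python
def morton_key(ix, iy, iz, bits=10):
    """Interleave the low `bits` bits of three grid indices into one key."""
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (3 * b)
        key |= ((iy >> b) & 1) << (3 * b + 1)
        key |= ((iz >> b) & 1) << (3 * b + 2)
    return key


# Sorting group grid coordinates by key keeps spatial neighbors close together.
cells = [(1, 1, 0), (0, 0, 0), (2, 0, 0), (0, 1, 0), (1, 0, 0)]
ordered = sorted(cells, key=lambda c: morton_key(*c))
```

Groups that share high-order coordinate bits receive nearby keys, so groups that are close in space tend to be close in the resulting numbering.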

One  final  issue  has  to  do  with  the  small  size  of  the  cache.  The  basic  loop  for  applying  translation 
operators  applies  all  operators  to  a  group  m  before  moving  on  to  the  next  group  in  the  ordering.  It  must  be 
done this way in order to keep $g_m$ (the far-field representation of the j-expansion for the group) in the cache
as  well.  Caches  are  too  small,  however,  to  keep  all  of  the  remote  fields  at  once,  defeating  the  purpose  of 
the  ordering.  The  solution  is  to  translate  only  a  piece  of  the  far-field  representation  of  a  far  away  group  at 
a  time.  The  specific  size  of  the  pieces  depends  primarily  on  the  cache  size,  but  using  a  piece  size  (kps)  of 
80  double  precision  complex  numbers  has  worked  well  in  practice  on  several  machines.  So,  at  a  given  level, 
the  ordering  is  traversed  translating  a  piece  of  the  far-field  representation  for  each  group.  At  the  end  of  the 
ordering,  the  process  moves  on  to  the  next  piece  of  the  representation.  This  is  repeated  until  all  the  far  fields 
have  been  translated  at  that  level.  The  code  then  continues  onto  the  next  level. 

In  detail,  translate  is  implemented  as  follows: 

    for (l ∈ 0 ... H - 1)
        for (kk = 0; kk < K_l; kk = kk + kps)
            ksize = min(kps, K_l - kk)
            for (m ∈ first(C_l, p) ... last(C_l, p))
                for (m' ∈ far(m))
                    for (k ∈ kk ... kk + ksize - 1)
                        g_mk = g_mk + T_mm'k s_m'k

where $T_{mm'k} = \mu(k, X_m - X_{m'})$. These quantities are computed in the setup phase.
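
The blocked loop can be sketched for a single level and a single thread as follows. The function name and the list-based storage are assumptions for illustration; the point is only that processing the far-field index k in pieces of size kps gives the same result as the unblocked loop.

```python
def translate_blocked(s, T, far, K, kps=80):
    """Accumulate g[m][k] += T[(m, m')][k] * s[m'][k], one k-piece at a time.

    s   -- s[m][k], far-field representation of each group (complex)
    T   -- T[(m, m')][k], diagonal translation operators
    far -- far[m], list of groups whose fields are translated to m
    K   -- number of far-field directions at this level
    kps -- piece size, chosen so one piece of each field fits in cache
    """
    n_groups = len(s)
    g = [[0j] * K for _ in range(n_groups)]
    for kk in range(0, K, kps):
        ksize = min(kps, K - kk)
        for m in range(n_groups):        # groups visited in cache-friendly order
            dst = g[m]
            for mp in far[m]:
                t, src = T[(m, mp)], s[mp]
                for k in range(kk, kk + ksize):
                    dst[k] += t[k] * src[k]
    return g


K = 7
s = [[complex(m + 1, k) for k in range(K)] for m in range(3)]
far = [[2], [], [0]]
T = {(0, 2): [complex(k, 1) for k in range(K)],
     (2, 0): [complex(1, -k) for k in range(K)]}
g_blocked = translate_blocked(s, T, far, K, kps=3)
g_plain = translate_blocked(s, T, far, K, kps=K)
```

Because the operators are diagonal in k, each entry of g is updated by the same operations in either version, so blocking changes only the memory access pattern.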

5.4  FMM  Parallel  Scalability  Results 

The  scaling  of  the  threaded  multilevel  FMM  apply  algorithm  is  shown  in  Table  1.  The  apply  time  in  seconds 
versus the number of processors is given for a r = 16λ sphere discretized by 153,600 unknowns. The
speedup $S_p = T_1/T_p$ (where $T_p$ is the apply time for p processors) and the parallel efficiency $100 S_p/p$ are
also  listed.  The  scaling  is  very  good,  with  32  processors  achieving  81%  efficiency. 

Processors    1       2        4        8        16       32
Tuned (s)     82.5    81.7     83.7     88.2     94.5     106.7
Naive (s)     99.1    110.4    124.5    158.8    204.9    271.0

Table 2
Time spent doing translations versus number of processors for tuned and naive implementations.

Year                 1992                1999
Code                 Patch               FastScat
Computer             Touchstone Delta    Origin 2000
Processors           512                 64
Radius (λ)           5.31                60
Area (λ²)            354                 45,239
Accuracy (dB rms)    2 (est)             0.12
Unknowns             48,673              2,160,000
Memory (Gb)          38                  45.5
Time (hrs)           19.6                27.9

Table 3
State of the Art: 1992 vs. 1999

Tuned and naive implementations of the translation operator application are compared in Table 2 for the
same problem. The table shows the total time spent by all processors in the translate step. The effort to apply
the operators grows by 29.3% as the number of processors increases from 1 to 32 (the elapsed time is 82.5 s
for 1 processor and 3.33 s for 32 processors). In contrast, if the operators are applied without ordering the
groups or dividing up the far field representation for cache efficiency, the effort to apply the operators grows
173% and begins to take a substantial fraction of the total FMM time.
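
The percentages quoted for Table 2 follow directly from the tabulated totals, as the short check below shows. The dictionary and function names are hypothetical; the numbers are the ones listed in Table 2.

```python
# Total time (s) summed over all processors in the translate step (Table 2).
tuned = {1: 82.5, 2: 81.7, 4: 83.7, 8: 88.2, 16: 94.5, 32: 106.7}
naive = {1: 99.1, 2: 110.4, 4: 124.5, 8: 158.8, 16: 204.9, 32: 271.0}


def growth_percent(total_times, p_lo=1, p_hi=32):
    """Percentage growth in total effort between p_lo and p_hi processors."""
    return 100.0 * (total_times[p_hi] / total_times[p_lo] - 1.0)


def elapsed(total_times, p):
    """Elapsed (wall clock) time if the total effort is shared by p processors."""
    return total_times[p] / p


tuned_growth = growth_percent(tuned)   # about 29.3
naive_growth = growth_percent(naive)   # about 173
tuned_elapsed_32 = elapsed(tuned, 32)  # about 3.33 s
```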

6  Electromagnetic  Scattering  Results 

FastScat was used to compute the bistatic RCS of a r = 60λ sphere for both polarizations on a 64 processor
SGI  Origin  2000.  Table  3  shows  information  from  the  run  including  problem  area,  accuracy  (as  compared  to 
the  Mie  series  solution),  number  of  unknowns,  memory  required,  and  run  time.  It  is  compared  to  the  1992 
result  from  the  Patch  code  on  the  Touchstone  Delta.  The  FastScat  run  times  by  phase  were  20.2  hours  for 
setup  (mostly  computing  local  corrections),  7.66  hours  for  the  solve  (computation  of  surface  currents  using 
the  FMM),  and  1.04  hours  to  compute  the  bistatic  RCS  at  1,800  angles.  Figure  1  plots  the  computed  RCS 
versus  the  Mie  series  solution.  The  two  curves  are  nearly  identical. 

The  Patch  code  used  a  tuned  out-of-core  solver  to  factor  Z.  The  solver  was  carefully  constructed  to 
overlap  disk  I/O,  interprocessor  communication,  and  computation,  to  achieve  high  performance.  It  sustained 
a  rate  of  10.35  Gflops,  which  was  within  a  factor  of  2  of  the  theoretical  maximum  rate  of  the  Delta  for  the 
inner loop of the computation. The Patch/Delta result represented the largest reported scattering run to date
in  1992. 

It  would  take  Patch/Delta  some  time  to  match  the  FastScat  result  in  both  size  and  accuracy.  In  order 
for  Patch  to  achieve  an  accuracy  of  roughly  0.2dB,  the  number  of  unknowns  would  have  to  be  increased  by 
about a factor of 10 due to the $O(h^2)$ convergence rate of its discretization. The difference in area is over a
factor of 100. Taken together, the unknown count must increase ~1000 fold. Since the factorization process
is $O(N^3)$, the run time can be expected to increase by roughly nine orders of magnitude.
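
The estimate can be written out step by step. The sketch below assumes that the error scales as $h^2$ and that the number of unknowns for a surface of area A scales as $A/h^2$, which is what makes a tenfold accuracy gain cost a tenfold increase in unknowns; the area figures are those of Table 3.

```latex
\frac{N_{\text{new}}}{N_{\text{old}}}
  \approx \underbrace{10}_{\text{accuracy}}
  \times \underbrace{\frac{45{,}239}{354}}_{\text{area} \,\approx\, 128}
  \sim 10^{3},
\qquad
\frac{T_{\text{new}}}{T_{\text{old}}}
  \approx \left(\frac{N_{\text{new}}}{N_{\text{old}}}\right)^{3}
  \sim \left(10^{3}\right)^{3} = 10^{9}.
```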

We  have  used  FastScat  to  compute  the  RCS  of  a  variety  of  benchmark  targets.  Figure  2  shows  the 
currents  induced  on  the  Dart,  a  standard  test  case,  at  18  GHz  with  the  incident  radiation  nose-on.  At  this 
frequency, the Dart is 4441 λ² in area and is discretized by 436,000 unknowns. Figure 3 shows the monostatic
RCS  in  both  polarizations  using  an  over-the-top  scan.  This  scan  goes  from  the  back  at  -90  degrees  to  the  tip  at 
90  degrees.  By  using  convergence  studies,  which  are  relatively  inexpensive  with  a  high  order  discretization, 
the  error  has  been  estimated  at  approximately  0.1  dB  in  the  high  RCS  regions  and  roughly  2  dB  in  the  stealthy 
regions  (near  the  tip).  FastScat  required  8.3  Gb  of  memory,  did  the  setup  in  3.0  hours,  and  solved  for  each 
monostatic angle in an average of 17 minutes. A 32 processor Origin 2000 was used. At 436,000 unknowns,
the 18 GHz Dart is too large for dense matrix techniques even on the biggest supercomputers.

FIG. 1. Computed RCS of a r = 60λ sphere compared to the Mie series solution (curves: Mie Series, FastScat).

FIG. 2. Computed surface currents of the Dart at 18 GHz.

FIG. 3. RCS of the Dart at 18 GHz in both polarizations using over-the-top scan (φφ polarization shifted -20 dB
for legibility). Plot title: Dart @ 18 GHz: Theta Scan.


All  of  the  runs  in  this  section  were  done  in  a  production  environment  where  FastScat  was  sharing 
machine  resources  with  other  jobs.  Generally  the  load  average  did  not  exceed  the  number  of  processors, 
but  this  was  not  always  the  case. 

7  Concluding  Remarks 

A  purpose  of  this  paper  is  to  put  forth  a  more  general  notion  of  scalability.  Parallel  scalability  is  important 
since  only  scalable  parallel  codes  utilize  large,  expensive  computers  effectively.  But  Moore’s  law  and  big 
iron  are  no  match  for  algorithmic  scalability. 

The Helmholtz FMM and contemporary large computers are complementary. Consider a slow $O(N^3)$
method  with  a  small  prefactor.  For  these  methods,  large  computers  confer  little  advantage.  A  modest  increase 
in  the  number  of  unknowns  quickly  exceeds  the  capacity  of  even  the  largest  machine.  As  a  result  of  increased 
microprocessor  performance  and  microprocessor  count  (from  a  few  hundred  to  a  few  thousand),  modern 
supercomputers  are  nearly  100  times  faster  than  the  Delta.  Yet  even  on  these  machines,  codes  that  do  not 
take  advantage  of  the  algorithmic  advances  can  only  do  problems  about  4  times  larger  than  what  the  Delta 
did  in  1992.  In  contrast,  the  Helmholtz  FMM  has  superior  asymptotic  complexity  but  a  large  prefactor.  It 
takes a fairly big machine just for the FMM to break even with respect to the slow method. But the benefit is
that  you  can  move  out  to  much  larger  problems  and  still  stay  within  the  available  machine  resources.  High 
accuracy  solutions  for  problems  exceeding  a  million  square  wavelengths  are  possible  on  the  largest  present 
day machines with modern algorithms.

The  FastScat  development  effort  is  continuing  in  the  areas  of  modeling  subwavelength  structures  such 
as  edges  and  gaps,  and  in  the  incorporation  of  material  properties.  We  see  no  reason  why  these  extensions 
cannot also be accomplished in a scalable way.

The need to filter functions defined on the sphere arises in a number of applications,
[...] and acoustic scattering, and several other areas. Recently, it has been observed
that the problem of uniform resolution filtering on the sphere can be performed
efficiently via the fast multipole method (FMM) in one dimension. In this paper, we
introduce a generalization of the FMM that leads to an accelerated version of the
filtering process. Instead of multipole expansions, the scheme uses special-purpose
bases constructed via the singular value decomposition of appropriately chosen
submatrices of the filtering matrix. The algorithm is applicable to a wide class of
projection operators; its performance is illustrated with several numerical
examples. © 1998 Academic Press

Key  Words:  singular  value  decompositions;  fast  algorithms;  spherical  harmonics. 


1.  INTRODUCTION 

[...] dimensions. While the two- and three-dimensional variants have found [...]
(see, for example, [3]). One such use of the one-dimensional FMM has recently been
published by Jakob-Chien and Alpert [10], in an algorithm for uniform resolution
filtering and interpolation of functions on the sphere; that algorithm has
uses in the solution of partial differential equations on the sphere [13], in fast algorithms for
electromagnetic scattering [4], and in several other environments. In this paper, we describe
a  version  of  the  one-dimensional  FMM  which  has  been  generalized  so  as  to  calculate 
not  only  electrostatic  potentials,  but  a  wide  class  of  similar  kernels,  and  we  describe  an 
accelerated  version  of  the  algorithm  of  [10]  in  which  two  subroutine  calls  to  the  original 
one-dimensional  FMM  are  replaced  by  one  call  to  the  generalized  FMM. 

Formally,  this  paper  describes  an  algorithm  for  the  following  task:  given  an  n  x  m  matrix 
P of a certain structure and given a desired accuracy ε, compress P so that its product with a
vector can be efficiently computed to that accuracy. The structure the algorithm requires of
P is as follows: there must exist numbers $x_1 < x_2 < \cdots < x_m$ and $y_1 < y_2 < \cdots < y_n$ such
that, roughly speaking, any submatrix of P which is separated in index space from the line
$x_j = y_i$ by a distance greater than its own size has a rank less than some (reasonably small)
number r, to the precision ε; the CPU time taken by the algorithm for multiplication of P
by a vector is then O(nr). (A rigorous accounting of the execution time of the algorithm is
somewhat complicated and is given in Section 3.2.6.) One matrix $P = [p_{ij}]$ which has such
a  structure  is  given  by  the  formula 


and  is  the  matrix  whose  multiplication  by  a  vector  is  implemented  by  the  original  one¬ 
dimensional  versions  of  the  FMM. 

This  paper  is  arranged  as  follows.  Section  2  briefly  reviews  numerical  tools  used  by  the 
algorithm.  Section  3  describes  the  generalized  FMM  in  its  basic  form.  Section  4  describes 
modifications  to  the  algorithm  of  Section  3,  the  principal  one  of  which  is  the  diagonalization 
of  roughly  a  third  of  the  interaction  matrices.  Section  5  contains  numerical  results  for 
the  generalized  FMM  applied  to  the  matrix  (1).  Section  6  describes  modifications  to  the 
algorithm  of  [10]  which  incorporate  the  generalized  FMM.  Finally,  Section  7  examines 
generalizations  of  the  schemes  presented  in  this  paper. 


2.  NUMERICAL  PRELIMINARIES 

2.1.  Singular  Value  Decomposition 

The singular value decomposition (SVD) is a ubiquitous tool in numerical analysis, given
for the case of real matrices by the following lemma (see, for instance, [14] for more details).

LEMMA 2.1. For any n × m real matrix A, there exist an integer p, an n × p real matrix
U with orthonormal columns, an m × p real matrix V with orthonormal columns, and a
p × p real diagonal matrix $S = [s_{ij}]$ whose diagonal entries are nonnegative, such that
$A = U S V^*$ and that $s_{ii} \ge s_{i+1,i+1}$ for all i = 1, ..., p - 1.

The diagonal entries $s_{ii}$ of S are called singular values of A; the columns of the matrix
V  are  called  right  singular  vectors;  the  columns  of  the  matrix  U  are  called  left  singular 
vectors. 

2.2.  Least  Squares  Approximation 

This  section  contains  three  lemmas  on  the  least  squares  approximation  of  matrices,  proven 
in a more general setting in [15]. In this section and in the remainder of the paper $R^{n \times m}$ will
denote the space of all real n × m matrices, and the matrix norm used will be the Schur or
Frobenius norm; that is, for an n × m real matrix $A = [a_{ij}]$,

$$\|A\| = \sqrt{\sum_{i=1}^{n} \sum_{j=1}^{m} a_{ij}^{2}}. \qquad (2)$$
LEMMA 2.2. Suppose A is a p × n real matrix, B is an m × k real matrix, and C
is a p × k real matrix, for some m, p, n, and k. Let $A = U_A S_A V_A^*$ be a singular value
decomposition of A, and let $B = U_B S_B V_B^*$ be a singular value decomposition of B. Let r
be the number of nonzero singular values of A, and let q be the number of nonzero singular
values of B. Let $\tilde{U}_A$ and $\tilde{V}_A$ consist of the first r columns of $U_A$ and $V_A$, respectively, and
let $\tilde{S}_A$ consist of the first r rows of the first r columns of $S_A$. Let $\tilde{U}_B$ and $\tilde{V}_B$ consist of the
first q columns of $U_B$ and $V_B$, respectively, and let $\tilde{S}_B$ consist of the first q rows of the first
q columns of $S_B$. Then the solution X of the minimization problem

$$\min \|AXB - C\|,$$

is given by

$$X = \tilde{V}_A \tilde{S}_A^{-1} \tilde{U}_A^{*}\, C\, \tilde{V}_B \tilde{S}_B^{-1} \tilde{U}_B^{*}. \qquad (4)$$

Furthermore,

$$\|AXB - C\| = \|C - \tilde{U}_A \tilde{U}_A^{*}\, C\, \tilde{V}_B \tilde{V}_B^{*}\|. \qquad (5)$$

The  following  lemma  provides  a  bound,  in  certain  situations,  on  the  error  of  the  approx¬ 
imation  given  by  Lemma  2.2. 

LEMMA 2.3. Under the conditions of Lemma 2.2, suppose that there exist an n × k
matrix D and a p × m matrix E such that

$$\|AD - C\| < \varepsilon_1 \qquad (6)$$

and

$$\|EB - C\| < \varepsilon_2. \qquad (7)$$

Then

$$\|AXB - C\| < \varepsilon_1 + \varepsilon_2. \qquad (8)$$

As  shown  by  the  following  lemma,  the  error  bound  of  Lemma  2.3  also  applies  when  a 
different  formula  for  the  minimizing  matrix  is  used. 

LEMMA 2.4. Under the conditions of Lemma 2.3, let the n × m matrix Y be given by
the formula

$$Y = D\, \tilde{V}_B \tilde{S}_B^{-1} \tilde{U}_B^{*}. \qquad (9)$$

Then

$$\|AYB - C\| < \varepsilon_1 + \varepsilon_2. \qquad (10)$$

3.  BASIC  FMM

This  section  describes  the  generalized  FMM  of  this  paper.  It  is  described  as  a  set  of 
modifications to the FMM of [6, 3]; the reader is assumed to be familiar with that algorithm.

The  overall  FMM  structure  of  an  upward  pass  for  creation  of  far  field  expansions,  fol¬ 
lowed  by  a  pass  which  computes  local  expansions  from  far  field  expansions,  followed  by 
a  downward  pass  which  propagates  local  expansions  to  lower  levels  and  evaluates  them,  is 
retained.  However,  all  the  expansions  are  different,  being  based  on  singular  value  decompo¬ 
sitions  rather  than  on  analytical  formulae.  In  addition,  the  hierarchical  subdivision  scheme 
is  different,  being  performed  according  to  matrix  indices  rather  than  according  to  point 
locations.  (The  expansions  used  permit  almost  any  subdivision  scheme,  whether  adaptive 
as  in  [15].  or  nonadaptive  as  in  [3]:  the  present  scheme  was  chosen  solely  for  its  simplicity.) 

3.1.  Subdivision  Scheme 

The hierarchical subdivision is performed on column indices of the matrix P, as follows:

• Each interval of column indices, if it is divided, is divided into two intervals of equal
size (or differing in size by one, if the number of indices in the interval is odd).

•  The  subdivision  is  uniform;  either  all  the  intervals  at  any  given  depth  of  the  tree  are 
subdivided,  or  none  are. 

• The subdivision process continues until the lowest-level intervals are as close as
possible to a user-chosen size.
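The three rules above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `split_interval` and `subdivide`, the inclusive `(first, last)` representation, and the stopping rule (stop when halving would move the average interval size further from the target) are assumptions made for the example.

```python
def split_interval(first, last):
    """Split an inclusive index interval into two halves whose sizes differ by at most one."""
    size = last - first + 1
    middle = first + size // 2          # first index of the right half
    return (first, middle - 1), (middle, last)


def subdivide(first, last, target_size):
    """Return a list of levels; each level is a list of inclusive (first, last) intervals."""
    levels = [[(first, last)]]
    while True:
        sizes = [b - a + 1 for a, b in levels[-1]]
        average = sum(sizes) / len(sizes)
        # Uniform rule: either every interval at this depth is split, or none is.
        # Assumed stopping rule: stop once halving would move the average size
        # further from the user-chosen target (or an interval cannot be split).
        if min(sizes) < 2 or abs(average - target_size) <= abs(average / 2 - target_size):
            return levels
        levels.append([half for a, b in levels[-1] for half in split_interval(a, b)])
```

For example, `subdivide(1, 100, 12)` produces four levels, the lowest consisting of eight intervals of sizes 12 and 13.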

For each interval $[j_1, j_2]$ of column indices produced by the above process, a corresponding
interval $[i_1, i_2]$ of row indices is chosen such that the portion of P addressed by the two
intervals of indices contains as much as possible of the line $x_j = y_i$. The precise criterion
used to choose the interval $[i_1, i_2]$ is that it should be the interval of maximal size such that

$$(x_{j_1-1} + x_{j_1})/2 \le y_{i_1} < \cdots < y_{i_2} < (x_{j_2} + x_{j_2+1})/2. \tag{11}$$

(If $x_{j_1-1}$ or $x_{j_2+1}$ does not exist, the corresponding inequality in the above equation is not
enforced. The quantities $x_1 < x_2 < \cdots < x_m$ and $y_1 < y_2 < \cdots < y_n$ were, in the present
implementation, user-provided; in an environment where they are not readily available, they
can be determined by numerically searching P for areas of high numerical rank.)
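The selection of the row interval can be illustrated as follows. The sketch assumes 0-based inclusive indices, sorted node lists `x` and `y`, and the reading of criterion (11) in which the left bound is non-strict and the right bound is strict; the function name `row_interval` is hypothetical.

```python
def row_interval(x, y, j1, j2):
    """Return the maximal inclusive row interval (i1, i2) matching columns [j1, j2].

    x and y are ascending lists of nodes; indices are 0-based.
    Returns None when no y-node falls between the two midpoints.
    """
    # A bound is dropped when the neighbouring x-node does not exist.
    lower = (x[j1 - 1] + x[j1]) / 2 if j1 > 0 else None
    upper = (x[j2] + x[j2 + 1]) / 2 if j2 + 1 < len(x) else None
    rows = [i for i, yi in enumerate(y)
            if (lower is None or lower <= yi) and (upper is None or yi < upper)]
    # Because y is sorted, the selected rows are contiguous.
    return (rows[0], rows[-1]) if rows else None
```

With `x = [0, 1, 2, 3]` and `y = [0.1, 0.6, 1.4, 2.5, 3.2]`, the column intervals `[0, 1]` and `[2, 3]` receive the row intervals `(0, 2)` and `(3, 4)`, which together cover every row exactly once.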

3.2.  Expansions 

This section describes the expansions used in the generalized FMM. Submatrices of P
will be designated as follows: $P_{a,b}$ denotes the portion of P whose column indices are in b
and whose row indices are in a, where a and b are either intervals of indices into P, or sets
thereof.

For each interval, the FMM divides the intervals at the same depth in the tree into two
sets:

• 1. The near field region, consisting of the interval itself and the two adjacent intervals
at the same depth in the tree of intervals.

• 2. The far field region, consisting of all remaining intervals at the same depth in the
tree. We denote the far field region of the ith interval by $F_i$.

A third set is also required: the interaction list of an interval i is the set of intervals at the
same depth in the tree which are in the far field of i and which are not in the far field of the
parent of i.
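The three sets can be made concrete for a uniform binary tree. The sketch below assumes that the intervals at one depth are numbered 0 to `count - 1` from left to right and that interval `i` has parent `i // 2` at the level above; it also reads "in the far field of the parent" as "having a parent that lies in the far field of the parent of i". The function names are hypothetical.

```python
def near_field(i, count):
    """The interval itself and its (at most two) neighbours at the same depth."""
    return [k for k in (i - 1, i, i + 1) if 0 <= k < count]


def far_field(i, count):
    """All remaining intervals at the same depth."""
    near = set(near_field(i, count))
    return [k for k in range(count) if k not in near]


def interaction_list(i, count):
    """Intervals in the far field of i whose parents are not in the far field of i's parent."""
    parent_far = set(far_field(i // 2, count // 2))
    return [k for k in far_field(i, count) if k // 2 not in parent_far]
```

At a level with eight intervals, `interaction_list(2, 8)` is `[0, 4, 5]`, while the boundary interval 0 has the shorter list `[2, 3]`.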

3.2.1. Far-field expansions. The original FMM [6] relies on the fact that the electrostatic
potential due to a set of charges can be represented to high precision, at points distant from
those charges, by a multipole expansion of relatively few terms. In the generalized FMM
described in this paper, the output (no longer necessarily the electrostatic potential, although
we will continue to use the terms “potential” and “charge” for convenience) does not need to
be describable by a multipole expansion, but can be describable by an arbitrary expansion,
provided that the expansion coefficients are linear functions of the charge magnitudes
and that the potential is a linear function of the expansion coefficients. The creation and
evaluation matrices for this expansion, which we will call a far-field expansion, do not need
to be furnished as such by the user; they are computed from the matrix P using the singular
value decomposition. This computation is performed for each interval i for which a far-field
expansion is needed and is as follows. Let $n_i \times m_i$ be the dimensions of the matrix $P_{F_i,i}$,
let the singular value decomposition of $P_{F_i,i}$ be denoted by $USV^*$, the number of singular
values by p, and the singular values by $s_1 \ge s_2 \ge \cdots \ge s_p$. Let $p_i$ be the minimum integer
such that

$$\sum_{j=p_i+1}^{p} s_j^2 < \varepsilon^2 \|P\|^2 \frac{n_i m_i}{nm}. \tag{12}$$

Let the $m_i \times p_i$ matrix $V_i$ consist of the first $p_i$ columns of V and let the $n_i \times p_i$ matrix
$E_i$ consist of the first $p_i$ columns of the product US. We will refer to $V_i^*$ as the far-field
expansion creation matrix for interval i and to $E_i$ as the far-field evaluation matrix; the latter
is not used explicitly in the algorithm.

As shown in [8], the product $E_i V_i^*$ is, among matrices of rank $p_i$, the closest
approximation to the matrix $P_{F_i,i}$ in the norm (2). Thus the number of terms in any known expansion
for $P_{F_i,i}$ (such as a multipole expansion) is an upper bound for the number of terms $p_i$ in
the far-field expansion of the same accuracy computed as above.
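The choice of the truncation rank can be illustrated without performing an SVD, by working directly on a list of singular values. The sketch assumes that criterion (12) compares the sum of the squares of the discarded singular values with $\varepsilon^2 \|P\|^2 n_i m_i/(nm)$; the function name `truncation_rank` and its argument names are hypothetical.

```python
def truncation_rank(sing_vals, eps, norm_P, n_i, m_i, n, m):
    """Smallest rank whose discarded squared singular values stay below the threshold.

    sing_vals must be in non-increasing order, as returned by an SVD routine.
    """
    threshold = eps ** 2 * norm_P ** 2 * (n_i * m_i) / (n * m)
    tail = 0.0                      # sum of squares of the values discarded so far
    rank = len(sing_vals)
    # Discard singular values from the smallest upward while the tail stays small.
    for s in reversed(sing_vals):
        if tail + s * s >= threshold:
            break
        tail += s * s
        rank -= 1
    return rank
```

For singular values `[1.0, 0.1, 0.01, 0.001]` and a threshold of 0.0025, the two smallest values can be discarded but the third cannot, so the rank is 2.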

3.2.2. Local expansions. Using far-field expansions alone, an $O(n \log n)$ version of
the FMM can be produced (for an overview of the various versions see [7]). The O(n)
version of the FMM requires additional numerical machinery, namely local expansions,
which approximate the potential on a region due to charges on distant regions. In the original
FMM, local expansions were harmonic expansions; in the generalized FMM, creation and
evaluation matrices for local expansions are computed from the matrix P using the singular
value decomposition, as follows. Let $n'_i \times m'_i$ be the dimensions of the matrix $P_{i,F_i}$; let the
singular value decomposition of $P_{i,F_i}$ be denoted by $USV^*$, the number of singular values
by r, and the singular values by $s_1 \ge s_2 \ge \cdots \ge s_r$. Let $r_i$ be the minimum integer such that

$$\sum_{j=r_i+1}^{r} s_j^2 < \varepsilon^2 \|P\|^2 \frac{n'_i m'_i}{nm}. \tag{13}$$

Let the $n'_i \times r_i$ matrix $U_i$ consist of the first $r_i$ columns of U. We will refer to $U_i$ as the
local expansion evaluation matrix for interval i.




3.2.3. Far-field translation matrices. The FMM does not compute far-field expansions
for intervals at high levels in the tree directly from the charges in the interval, but rather
computes them from far-field expansions at lower levels. Associated with each interval i
whose parent interval j has a far-field expansion is a translation matrix $T_i$, which takes as
input a far-field expansion for i and produces as output a far-field expansion for j which
evaluates to the same potential. Let $V_i^*$ be the far-field creation matrix for interval i, and
let $V_{j,i}^*$ be the far-field creation matrix for interval j, with columns deleted such that it
only accepts input from the interval i. Clearly the translation matrix $T_i$ should be such that
for any $m_i$-vector q, the vector $T_i V_i^* q$ is as close as possible, by some measure, to the
vector $V_{j,i}^* q$. The measure we use is the least squares measure; in particular, $T_i$ is chosen
so as to minimize the quantity $\|V_{j,i}^* - T_i V_i^*\|$. The formula for such minimization is given
by Lemma 2.2; using the fact that the singular value decomposition of any matrix with
orthogonal columns consists of that matrix multiplied by two identity matrices, it reduces
in this case to

$$T_i = V_{j,i}^* V_i. \tag{14}$$

We will refer to $T_i$ as the far-field expansion translation matrix for interval i.

Lemma 2.4 gives a bound for the error associated with using the translation matrix $T_i$.
Suppose $E_{j,k}$ and $E_{i,k}$ are matrices which take as input the far-field expansions on interval j
and on interval i, respectively, and use them to evaluate the potential on some other interval
k and are such that

$$\|P_{i,k} - E_{j,k} V_{j,i}^*\| < \varepsilon_1, \tag{15}$$

$$\|P_{i,k} - E_{i,k} V_i^*\| < \varepsilon_2. \tag{16}$$

Using (15), (16), and Lemma 2.4, we get that

$$\|P_{i,k} - E_{j,k} T_i V_i^*\| < \varepsilon_1 + \varepsilon_2. \tag{17}$$
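The way Lemma 2.4 is applied here can be spelled out by matching symbols. The identification below is one reading of the argument rather than a formula quoted from the source: it takes $A = E_{j,k}$, $B = V_i^*$, $C = P_{i,k}$, $D = V_{j,i}^*$, and $E = E_{i,k}$, and uses the fact that $V_i^*$ has the trivial decomposition $U_B = I$, $S_B = I$, $V_B = V_i$.

```latex
\begin{aligned}
\|AD - C\| &= \|E_{j,k} V_{j,i}^* - P_{i,k}\| < \varepsilon_1, \\
\|EB - C\| &= \|E_{i,k} V_i^* - P_{i,k}\| < \varepsilon_2, \\
Y &= D V_B S_B^{-1} U_B^* = V_{j,i}^* V_i = T_i, \\
\|AYB - C\| &= \|E_{j,k} T_i V_i^* - P_{i,k}\| < \varepsilon_1 + \varepsilon_2 .
\end{aligned}
```

Under this identification the matrix Y of Lemma 2.4 coincides with the translation matrix obtained from the least squares problem, so the two error bounds on the child and parent expansions simply add.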

3.2.4. Local expansion translation matrices. The FMM does not evaluate local expansions
for intervals at high levels in the tree directly at each of the points at which the potential
is to be evaluated, but rather transforms them into local expansions for intervals at lower
levels. Associated with each interval i, whose parent interval j has a local expansion, is a
translation matrix $M_i$, which takes as input a local expansion on j and produces as output a
local expansion on i. $M_i$ is computed as follows. Let $U_i$ be the local expansion evaluation
matrix for interval i, and let $U_{j,i}$ be the local expansion evaluation matrix for interval j,
with rows deleted so that it only produces output on the interval i. Clearly the translation
matrix $M_i$ should be such that for any $r_j$-vector a, the vector $U_i M_i a$ is as close as possible,
by some measure, to the vector $U_{j,i} a$. The measure we use is the least squares measure; in
particular, $M_i$ is chosen so as to minimize the quantity $\|U_{j,i} - U_i M_i\|$. The formula for such
minimization is given by Lemma 2.2. Using the fact that the singular value decomposition
of any matrix with orthogonal columns consists of that matrix multiplied by two identity
matrices, it reduces in this case to

$$M_i = U_i^* U_{j,i}. \tag{18}$$

The error incurred by using $M_i$ is bounded by Lemma 2.4; the analysis is almost identical
to that presented in Section 3.2.3 for the far-field translation matrix $T_i$ and is omitted. We
will refer to $M_i$ as the local expansion translation matrix for interval i.

3.2.5. Far-field to local interaction matrices. A far-field to local interaction matrix $E_{j,i}$
takes as input a far-field expansion on an interval i and produces as output a local expansion
on another interval j. Such matrices are constructed only for pairs of intervals (i, j) such
that j is in the interaction list of i. The matrix $E_{j,i}$ should be such that for all $m_i$-vectors q
the product $U_j E_{j,i} V_i^* q$ is as close as possible, by some measure, to the product $P_{j,i} q$. We
choose $E_{j,i}$ so as to minimize the quantity

$$\varepsilon_{j,i} = \|U_j E_{j,i} V_i^* - P_{j,i}\|. \tag{19}$$

The formula for such minimization is given by Lemma 2.2; using the fact that the singular
value decomposition of any matrix with orthogonal columns consists of that matrix
multiplied by two identity matrices, it reduces in this case to

$$E_{j,i} = U_j^* P_{j,i} V_i. \tag{20}$$

Lemma 2.3, combined with (12) and (13), gives a bound for $\varepsilon_{j,i}$:

$$\varepsilon_{j,i} < \varepsilon \|P\| \sqrt{\frac{n_i m_i}{nm}}. \tag{21}$$

We will refer to $E_{j,i}$ as the far-field to local interaction matrix from interval i to interval j.

Remark 3.1. A brief inspection of the above formulae for the creation, translation,
and evaluation matrices $\{U_i\}$, $\{V_i\}$, $\{T_i\}$, $\{M_i\}$, and $\{E_{j,i}\}$ shows that the same matrices
are generated, in different roles, if the input matrix to the algorithm is the adjoint $P^*$ of
P, provided that the hierarchical subdivision is retained: the far-field expansion creation
matrices for P are identical to the local expansion evaluation matrices for $P^*$, and vice
versa; the far-field translation matrices for P are identical to the local expansion translation
matrices for $P^*$, and vice versa; and the far-field to local matrices for P are the adjoints of
the far-field to local matrices for $P^*$. Thus the matrices precomputed for P can also be used
for multiplying by $P^*$.
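The exchange of roles can be seen by taking the adjoint of a single compressed block. The line below is an illustrative reading of the remark, using the approximation $P_{j,i} \approx U_j E_{j,i} V_i^*$ of Section 3.2.5; it is not a formula quoted from the source.

```latex
P_{j,i} \approx U_j \, E_{j,i} \, V_i^*
\quad\Longrightarrow\quad
(P^*)_{i,j} = (P_{j,i})^* \approx V_i \, E_{j,i}^* \, U_j^* .
```

In the factorization on the right, $U_j^*$ acts first and so plays the part of a creation matrix, $V_i$ acts last and plays the part of an evaluation matrix, and the middle factor is the adjoint of the original far-field to local matrix.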

3.2.6. Execution time. The FMM performs one matrix-vector multiplication for each
instance of the matrices $\{U_i\}$, $\{V_i\}$, $\{T_i\}$, $\{M_i\}$, and $\{E_{j,i}\}$. Thus the CPU time which it
consumes is proportional to the total number of elements in all instances of the matrices. The
sizes of the matrices depend on the numerical ranks $p_i$ and $r_i$, as defined by (12) and (13).
We analyze the execution time further only in the case that all those ranks are bounded
by some number r. In that case, the computation of far-field expansions from the input takes
O(mr) time, the computation of the output from local expansions takes O(nr) time, and the
computations of expansions from other expansions take $O(kr^2)$ time, where k is the total
number of intervals produced by the subdivision process. Assuming that m is proportional to
n, the total execution time is $O(nr + kr^2)$. The quantity $nr + kr^2$ is minimized (with respect
to k) when n/k is equal to r. Since n/k is proportional to the size of the lowest-level intervals,
the minimum execution time occurs when the size of the lowest-level intervals is
proportional to r, with the constant of proportion depending on the details of the computer involved.
4. TECHNICAL IMPROVEMENTS

4.1. Diagonalization of Far Field to Local Matrices

A certain amount of freedom is present in the definition of far field and local expansions:
the results of the FMM are clearly unaffected if the far-field expansion creation matrix $V_i^*$
for an interval i is multiplied on the left by any orthogonal matrix W, its far field translation
matrix $T_i$ is multiplied on the right by $W^*$, and its far field to local matrices $E_{j,i}$ for all j are
multiplied on the right by $W^*$. Similarly, the results of the FMM are unaffected if the local
expansion evaluation matrix $U_i$ for an interval i is multiplied on the right by any orthogonal
matrix W, its local expansion translation matrix $M_i$ is multiplied on the left by $W^*$, and its
far field to local matrices $E_{i,j}$ for all j are multiplied on the left by $W^*$.

We use this freedom to diagonalize one of the (usually three) far field to local matrices for
each interval. Suppose that $E_{i,j}$ for some intervals i and j is the matrix to be diagonalized.
Let its singular value decomposition be denoted by $E_{i,j} = USV^*$. Then we multiply $V_j^*$ on
the left by $V^*$, and multiply $U_i$ on the right by U, also changing translation matrices and
far field to local matrices as indicated in the previous paragraph so that the results of the
FMM are unaffected.
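A short check shows why this change of basis turns the chosen matrix into a diagonal one without changing the product that the FMM applies. The primed symbols are notation introduced only for this illustration, and the check assumes the convention that creation matrices are multiplied on the left and evaluation matrices on the right, with U and V taken as the square orthogonal factors of the decomposition.

```latex
\begin{aligned}
E_{i,j} &= U S V^*, \qquad U_i' = U_i U, \qquad (V_j')^* = V^* V_j^*, \qquad
E_{i,j}' = U^* E_{i,j} V = S, \\
U_i' \, E_{i,j}' \, (V_j')^* &= U_i U \, U^* E_{i,j} V \, V^* V_j^* = U_i \, E_{i,j} \, V_j^* .
\end{aligned}
```

The redefined interaction matrix is the diagonal matrix S, so applying it costs a number of operations proportional to its dimension rather than to the square of its dimension.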

Far field to local matrices are chosen for diagonalization in such a way that each expansion
redefined by this process is redefined only once. The scheme used is as follows: each level of
intervals is divided into blocks of four adjacent intervals; inside each block the interactions
chosen for diagonalization are 1 → 3, 2 → 4, 3 → 1, and 4 → 2 (as depicted in Fig. 1).
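The pairing rule can be enumerated directly. The sketch below uses 0-based interval numbers, so the pattern 1 → 3, 2 → 4, 3 → 1, 4 → 2 becomes (0, 2), (1, 3), (2, 0), (3, 1) within each block; the function name `diagonalized_pairs` is hypothetical, and intervals left over when the level size is not a multiple of four are simply skipped in this illustration.

```python
def diagonalized_pairs(count):
    """Return (source, target) pairs of intervals whose interaction matrix is diagonalized."""
    pattern = [(0, 2), (1, 3), (2, 0), (3, 1)]   # 1->3, 2->4, 3->1, 4->2 in 1-based terms
    pairs = []
    for base in range(0, count - count % 4, 4):  # one block of four adjacent intervals
        pairs.extend((base + source, base + target) for source, target in pattern)
    return pairs
```

Each interval appears exactly once as a source (its far-field expansion is redefined once) and exactly once as a target (its local expansion is redefined once), and every pair is separated by one interval, so the target lies in the far field of the source.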

4.2.  Splits  by  Factors  Other  Than  Two 

Another modification which was made to the above FMM is to split intervals into more
than two pieces. This clearly can be done to any interval, at any level in the tree. However,
the only use which was made of this flexibility was to alter the top of the tree of intervals
slightly, so as to control better the size of the lowest-level intervals in the tree. The top
interval was split either into two, three, or five pieces; if three, its subintervals might each

FIG. 1. Far field to local operators which are diagonalized.
TABLE I

Double Precision Timings for the 1/x Kernel

                                Times (seconds)           Ratio       Memory
     N   Error ($L^2$ norm)     Init     Eval   Direct    eval/FFT    (REAL*8 spaces)

    64   0.35477E-15           0.070    0.001    0.001      5.21          3852
   128   0.92042E-15           0.820    0.003    0.005      7.31         10407
   256   0.23512E-14           6.620    0.007    0.019      8.93         26205
   512   0.16144E-13          39.700    0.013    0.073      5.60         52263
  1024   0.21925E-13         214.710    0.031    0.730      4.16        117881

be split into three parts, the remaining intervals in the tree all being split into two parts. This
permits a choice of the size of the lowest-level intervals not only of $n/2^k$ for any k, but also
of $n/(3 \times 2^k)$, $n/(5 \times 2^k)$, or $n/(9 \times 2^k)$.
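The effect of the modified top of the tree can be illustrated by listing the attainable lowest-level sizes and picking the one nearest a requested size. The function names, the `max_depth` cutoff, and the convention that a top factor of 1 denotes a purely binary tree are assumptions made for this sketch.

```python
def lowest_level_sizes(n, max_depth):
    """List (size, top_factor, k) for the sizes n / (top_factor * 2**k) named in the text."""
    options = []
    for top_factor in (1, 3, 5, 9):      # 1: binary only; 9: top split in three, then three again
        for k in range(max_depth + 1):
            options.append((n / (top_factor * 2 ** k), top_factor, k))
    return options


def best_split(n, target, max_depth=20):
    """Choose the attainable lowest-level size closest to the requested target."""
    return min(lowest_level_sizes(n, max_depth), key=lambda option: abs(option[0] - target))
```

For n = 720 and a target size of 15, a purely binary tree can only offer 22.5 or 11.25, whereas splitting the top interval into three pieces gives exactly 15.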


5.  NUMERICAL  RESULTS 

For comparison against the older one-dimensional FMMs of [3, 15], the generalized
FMM was applied to the 1/x kernel; that is, the input matrix $P = [p_{ij}]$ was given by (1).
Timings for various numbers of points n are listed in Tables I and II for double and single
precision (that is, with the parameter ε set to $10^{-14}$ and $10^{-7}$). In all cases, the parameter
m was set to be equal to n, and the nodes $\{x_i\}$ were identical to the nodes $\{y_i\}$, being slightly
perturbed equispaced nodes. All timings were performed on a Sun Sparcstation 10 in double
precision (Fortran REAL*8) arithmetic. Also included in the tables are ratios of the execution
time of the algorithm to the execution time of a standard SLATEC FFT of size n.

From the timings, it can be seen that the generalized FMM is similar in execution speed
to the best previous 1D FMM (that of [15]) known to the authors. It is, however, far inferior
to the FMMs of [3, 15] in the time spent in the precomputation stage; initialization times
for those algorithms did not exceed execution time by more than a factor of 10, whereas the
initialization time for the generalized FMM exceeds the execution time by factors in the thousands.
Effectively, this limits the usefulness of the procedure of this paper to problems of sufficient
importance that the initialization data can be precomputed and stored. The following section
discusses one such case.


TABLE II

Single Precision Timings for the 1/x Kernel

                                Times (seconds)           Ratio       Memory
     N   Error ($L^2$ norm)     Init     Eval   Direct    eval/FFT    (REAL*8 spaces)

    64   0.25040E-08           0.040    0.001    0.001      4.74          3500
   128   0.23352E-07           0.440    0.002    0.005      5.90          8465
   256   0.19125E-06           3.580    0.005    0.018      6.13         17803
   512   0.64886E-06          22.710    0.010    0.074      4.03         36911
  1024   0.28910E-06         124.690    0.021    0.590      2.77         79407
6. APPLICATION TO FILTERING

This section describes a use of the generalized FMM, in an algorithm recently published
by Jakob-Chien and Alpert [10] for uniform resolution filtering of functions on the
sphere. Their algorithm as a whole performs the following task: given numbers $f(\phi_i, \theta_j)$,
$i = 1, \ldots, I$; $j = 1, \ldots, J$, such that

$$f(\phi_i, \theta_j) = \sum_{n=0}^{K} \sum_{m=-n}^{n} f_n^m Y_n^m(\phi_i, \theta_j), \tag{22}$$

it computes numbers $\tilde f(\tilde\phi_i, \tilde\theta_j)$ such that

$$\tilde f(\tilde\phi_i, \tilde\theta_j) = \sum_{n=0}^{N} \sum_{m=-n}^{n} f_n^m Y_n^m(\tilde\phi_i, \tilde\theta_j), \tag{23}$$

where the functions $Y_n^m$ are the surface harmonics and where $\{\phi_i\}$, $\{\theta_j\}$, $\{\tilde\phi_i\}$, and $\{\tilde\theta_j\}$ are
appropriately chosen grid points (see [10] for details).

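We modify only the core of the algorithm of [10], which performs the following
one-dimensional filtering operation: given numbers $f^m(\theta_1), \ldots, f^m(\theta_J)$ such that

$$f^m(\theta_i) = \sum_{j=m}^{J-1} f_j^m P_j^m(\mu_i), \qquad i = 1, \ldots, J, \tag{24}$$

compute numbers $\tilde f^m(\tilde\theta_1), \ldots, \tilde f^m(\tilde\theta_N)$ such that

$$\tilde f^m(\tilde\theta_i) = \sum_{j=m}^{N} f_j^m P_j^m(\tilde\mu_i), \qquad i = 1, \ldots, N, \tag{25}$$

where the functions $P_j^m$ are the normalized associated Legendre functions, $\mu_i = \sin\theta_i$, and
$\tilde\mu_i = \sin\tilde\theta_i$.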

Due to the orthonormality of the functions $P_n^m$ for fixed m and integer $n \ge m$, if the nodes
$\mu_1, \ldots, \mu_J$ are Legendre nodes (nodes of the Gaussian quadrature corresponding to the
weight function $\omega(x) = 1$; see, for instance, [14]), then the coefficients $f_n^m$
are given by

$$f_n^m = \sum_{j=1}^{J} f^m(\theta_j) P_n^m(\mu_j) w_j, \tag{26}$$

where $w_1, \ldots, w_J \in \mathbb{R}$ are the Gaussian weights corresponding to the nodes $\mu_1, \ldots, \mu_J$.

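Combining (25) and (26) yields an equation for the entire filtering operation:

$$\tilde f^m(\tilde\theta_i) = \sum_{k=1}^{J} f^m(\theta_k) w_k \sum_{j=m}^{N} P_j^m(\mu_k) P_j^m(\tilde\mu_i). \tag{27}$$

Equation (27) constitutes a linear transformation from $f^m(\theta_1), \ldots, f^m(\theta_J)$ to
$\tilde f^m(\tilde\theta_1), \ldots, \tilde f^m(\tilde\theta_N)$; we will refer to the matrix of this transformation as the filtering matrix and will
denote it by P. Using the Christoffel-Darboux formula for the associated Legendre functions
(see, for instance, [1, Section 8.9.1]), which is

$$(\mu - \tilde\mu) \sum_{n=m}^{N} P_n^m(\mu) P_n^m(\tilde\mu) = \varepsilon_{N+1}^m \left( P_{N+1}^m(\mu) P_N^m(\tilde\mu) - P_N^m(\mu) P_{N+1}^m(\tilde\mu) \right), \tag{28}$$

where

$$\varepsilon_n^m = \sqrt{(n^2 - m^2)/(4n^2 - 1)}, \tag{29}$$

the filtering operation can be written as

$$\tilde f^m(\tilde\theta_i) = \varepsilon_{N+1}^m \left( P_{N+1}^m(\tilde\mu_i) \sum_{k=1}^{J} \frac{f^m(\theta_k) w_k P_N^m(\mu_k)}{\tilde\mu_i - \mu_k} - P_N^m(\tilde\mu_i) \sum_{k=1}^{J} \frac{f^m(\theta_k) w_k P_{N+1}^m(\mu_k)}{\tilde\mu_i - \mu_k} \right). \tag{30}$$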


From (30) it can immediately be seen that the filtering matrix consists of the sum of two
matrices of the form (1), each multiplied on the left and the right by a diagonal matrix.
Thus, the filter can be implemented using two calls to an FMM for the 1/x kernel; this is
the method presented in [10] (from where the above analysis is copied). It also follows that,
if the generalized FMM of this paper is applied to the filtering matrix, the numerical ranks
$\{r_i\}$ and $\{p_i\}$ (see (13) and (12)) are no more than twice the corresponding ranks when the
generalized FMM is applied to a matrix of the form (1). Thus, the filter can be implemented
efficiently via a single call to the generalized FMM.
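The Christoffel-Darboux identity that underlies this reduction can be checked numerically in the simplest case m = 0, where the normalized associated Legendre functions reduce to the orthonormal Legendre polynomials. The sketch below is an illustration under that restriction only; the function names are hypothetical, and the polynomials are generated by the standard three-term recurrence $x p_n = \varepsilon_{n+1} p_{n+1} + \varepsilon_n p_{n-1}$ with $\varepsilon_n = n/\sqrt{4n^2 - 1}$.

```python
import math


def epsilon(n, m=0):
    """Recurrence coefficient sqrt((n^2 - m^2) / (4 n^2 - 1))."""
    return math.sqrt((n * n - m * m) / (4 * n * n - 1))


def normalized_legendre(x, count):
    """Values p_0(x), ..., p_{count-1}(x) of the Legendre polynomials orthonormal on [-1, 1]."""
    values = [1 / math.sqrt(2)]
    if count > 1:
        values.append(x * values[0] / epsilon(1))
    for n in range(1, count - 1):
        values.append((x * values[n] - epsilon(n) * values[n - 1]) / epsilon(n + 1))
    return values


def christoffel_darboux_gap(x, y, N):
    """Absolute difference between the two sides of the identity for m = 0."""
    px = normalized_legendre(x, N + 2)
    py = normalized_legendre(y, N + 2)
    lhs = (x - y) * sum(px[n] * py[n] for n in range(N + 1))
    rhs = epsilon(N + 1) * (px[N + 1] * py[N] - px[N] * py[N + 1])
    return abs(lhs - rhs)
```

A sum of N + 1 products is thereby replaced by two products divided by the difference of the arguments, which is what exposes the 1/x structure of the filtering matrix.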

Remark 6.1. If N is larger than J, the operation (30) amounts to interpolation rather
than filtering. If the output nodes $\{\tilde\mu_i\}$ are the Legendre nodes of order N, then the filtering
matrix from J nodes to N nodes is, except for the multiplication of the input by Gaussian
weights, the adjoint of the interpolation matrix from N nodes to J nodes; this can easily be
seen by inspection of (30). Thus, the matrices $\{U_i\}$, $\{V_i\}$, $\{T_i\}$, $\{M_i\}$, and $\{E_{j,i}\}$, precomputed
for the purpose of filtering, can also be used for interpolation (see Remark 3.1).


6.1.  General  Nodes 

If the nodes $\mu_1, \ldots, \mu_J$ are not Legendre nodes, then the coefficients $f_j^m$ cannot
be computed by direct use of the formula (26). In this case, two methods of performing the
filtering operation are available. First, Eq. (24) can be solved for the coefficients $f_j^m$.
Alternatively, the function can be interpolated onto Legendre nodes, following which the
filtering matrix for Legendre nodes (30) can be used. We use the second method to show
that the filtering matrix for general nodes can be compressed by the generalized FMM; we
used the first method in our implementation.

As is well known (see, for instance, [1]), each of the associated Legendre functions $P_n^m$
is either a polynomial or a polynomial multiplied by $\sqrt{1 - x^2}$, depending on whether m
is even or odd. Thus the interpolation onto Legendre nodes is a polynomial interpolation,
which, if m is odd, is preceded by a division by $\sqrt{1 - x^2}$ and followed by a multiplication by
$\sqrt{1 - x^2}$. As shown in [3], polynomial interpolation can be performed in O(n) time using
an FMM. The filtering matrix for general nodes is the product of the interpolation matrix
and the filtering matrix for Legendre nodes; since each of these can be compressed by a
generalized FMM, their product also can be compressed by a generalized FMM (see [2]).

Remark 6.2. In the solution of Eq. (24) for the coefficients $f_j^m$, when m > 0,
there are more equations than unknowns. The definition of the problem is such that there is
an exact solution; however, numerically, this issue was dealt with by solving the equation
in the least squares sense.

6.2.  Optimizations 

The above filtering algorithm admits several optimizations. We describe them only for
the case when the nodes $\mu_1, \ldots, \mu_J$ are Legendre nodes; however, all of them have also
been implemented in the case of general nodes.

First, when m is close to N, the number of coefficients $f_j^m$ to be extracted is small; thus
direct computation of (26) followed by (25) is the most efficient algorithm for the filter.

Second, portions of the filtering matrix have negligible norm and can be discarded. This
can be easily seen by examination of (30), using the fact that the functions $P_n^m$ take on
small values near the endpoints of the interval [-1, 1]. The fraction of the matrix which
can be discarded increases with increasing m, to as much as eight ninths. This optimization
is clearly not specific to the generalized FMM; it can be applied equally well to the direct
method or to the unaltered algorithm of [10] and was applied to the direct method code
which was used in the timings presented below.

Third, the filter can be sped up slightly by splitting the input function into odd and
even parts, and filtering them separately. Each of the associated Legendre functions $P_n^m$ is
either odd or even, with functions of successive degree n being alternately odd and then
even. Thus the filter, applied to an odd function, yields an odd function and, applied to
an even function, yields an even function. This implies that the filtering matrix is
block-diagonalized (into two blocks) by the separation of odd functions from even functions. We
address only the case in which the separation can be done trivially, that is, when each of
the sets of nodes $\{\mu_i\}$ and $\{\tilde\mu_j\}$ is symmetric around zero; for brevity of explanation, we
further assume that N and J are even. In this case the separation of odd functions from even
functions is accomplished by the usual formulae

$$f_{\mathrm{odd}}(x) = (f(x) - f(-x))/2, \tag{31}$$

$$f_{\mathrm{even}}(x) = (f(x) + f(-x))/2, \tag{32}$$

where, as usual, the functions $f_{\mathrm{odd}}$ and $f_{\mathrm{even}}$ are respectively antisymmetric and symmetric
around zero and, thus, need only be stored at half the nodes. It is easily shown, using (30) and (31), that in the
case that the nodes $\mu_1, \ldots, \mu_J$ are Legendre nodes, each block $P = [p_{ij}]$ of the
block-diagonalized filtering matrix is given by

A  p^ip^nipow,  -  p’Zip^p^ipowi 
Pij  — 

M,  -  Mi 
fij  +  Mi 

where, for the block which filters even functions, the "±" sign is an addition, and, for the
block which filters odd functions, it is a subtraction. An inspection of (33) immediately
shows that each block is compressible by a generalized FMM.
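The separation into odd and even parts at symmetric nodes can be sketched as follows. The sketch assumes an even number of samples stored in ascending node order, so that the k-th positive node and its mirror image sit at positions `half + k` and `half - 1 - k`; the function names are hypothetical.

```python
def split_odd_even(values):
    """Split samples at symmetric nodes into odd and even parts on the positive nodes."""
    half = len(values) // 2
    odd, even = [], []
    for k in range(half):
        positive = values[half + k]        # f(x_k) with x_k > 0
        negative = values[half - 1 - k]    # f(-x_k)
        odd.append((positive - negative) / 2)
        even.append((positive + negative) / 2)
    return odd, even


def merge_odd_even(odd, even):
    """Rebuild the samples at all nodes from the two half-length parts."""
    half = len(odd)
    negative_side = [even[k] - odd[k] for k in reversed(range(half))]
    positive_side = [even[k] + odd[k] for k in range(half)]
    return negative_side + positive_side
```

Each part has half the length of the input, so the two blocks of the block-diagonalized filtering matrix act on vectors of half the original size, and the filtered parts are recombined with `merge_odd_even`.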
Remark 6.3. Experimentally, the ranks produced by the generalized FMM when applied
to the block-diagonalized matrix are almost identical to the ranks produced when applied
to the original filtering matrix, except near the point $\mu = 0$, where the ranks are slightly
smaller in the block-diagonalized version.

Remark 6.4. Since the generalized FMM is, when applied to matrices of this form, an
O(n) procedure, splitting the problem into two problems of half the size does not produce
any asymptotic improvement in execution time, although it does produce an improvement
for small to medium-sized n. By contrast, applying this optimization to the direct method
(as was done in the code used in the timings presented below) reduces the execution time
by a factor of 2 asymptotically, since the direct method is $O(n^2)$.

6.3.  Numerical  Results 

Table III contains experimental results for the filter for functions tabulated at Legendre
nodes. The filter was run for several values of J, with N = J/2 and for each $m = 1, \ldots, N$;
the average initialization and execution times, the average $L^2$ error, and the average amount
of memory used for precomputed data (for all values of m) are tabulated. The quantity
labeled as initialization time is, as before, the amount of time taken to compute the matrices
which comprise the generalized FMM; this task only needs to be performed once for any
combination of J and N, since the precomputed matrices can be stored. All figures were
produced by an implementation in double precision (Fortran REAL*8) arithmetic on a Sun
Sparcstation 10. The table also contains the amount of time taken by the direct method and
the ratio of the execution time of the FMM-based filter to the execution time of a standard


TABLE III

Filter Timings for Points Tabulated at Legendre Nodes

         Average time per m (seconds) for    Ratio:      Average          Average memory used
     J   Direct     FMM eval    FMM init     eval/FFT    error ($L^2$)    (REAL*8 spaces)

Requested accuracy $10^{-3}$
    64   0.00014    0.00021       0.038        1.10      0.87216E-04            637
   128   0.00059    0.00063       0.173        1.73      0.21141E-03           1814
   256   0.00239    0.00172       0.861        2.25      0.35270E-03           4684
   512   0.00916    0.00406       4.528        1.64      0.55393E-03          10586
  1024   0.15601    0.00930      22.708        1.26      0.72021E-03          22799

Requested accuracy $10^{-7}$
    64   0.00016    0.00020       0.035        1.05      0.62995E-09            715
   128   0.00069    0.00068       0.145        1.84      0.89805E-08           2351
   256   0.00272    0.00199       0.749        2.61      0.20946E-07           7074
   512   0.01015    0.00545       4.480        2.21      0.35158E-07          18763
  1024   0.17623    0.01351      25.102        1.84      0.50011E-07          45001

Requested accuracy $10^{-12}$
    64   0.00017    0.00018       0.035        0.97      0.64733E-13            712
   128   0.00078    0.00070       0.118        1.88      0.36187E-12           2604
   256   0.00312    0.00221       0.630        2.90      0.13528E-12           8496
   512   0.01102    0.00656       3.752        2.64      0.30608E-12          26072
  1024   0.19227    0.01763      26.347        2.37      0.14238E-11          66714
SLATEC FFT of size J. The direct method for which timings are listed is a modestly
optimized variant: the filtering matrix it used was precomputed; certain optimizations used
for the FMM-based method were also applied to it, as described in Section 6.2.

The  filter  was  also  implemented  for  functions  tabulated  at  general  nodes  (Section  6.1) 
and  was  tested  on  Chebyshev  nodes.  The  timings  are  almost  identical,  with  the  only  major 
difference  being  that  considerably  more  time  was  required  to  compute  the  filtering  matrix; 
they  are  omitted. 

Remark 6.5. The implausibly large CPU times taken by the direct method for J = 1024
are the result of the problem size exceeding the size of the cache; on the machine on which
timings were run, only two double precision vectors of length 1024 fit in the data cache.
Such a jump in timings is not expected to occur on most machines and, in any case, could
be eliminated by use of a blocked matrix-vector multiplication routine.
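The blocked multiplication mentioned in Remark 6.5 can be sketched as follows. This is only an illustration of the loop structure, not the routine used for the timings: the function names and the tile width `block` are assumptions, and pure Python will not reproduce the cache behaviour of compiled Fortran.

```python
def matvec_plain(a, x):
    """Reference product: one full pass over x for every row of a."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def matvec_blocked(a, x, block=256):
    """Same product, computed one column tile at a time.

    Within a tile, the slice x[j0:j1] is reused by every row, which is what
    lets a compiled implementation keep that slice resident in the data cache.
    """
    y = [0.0] * len(a)
    for j0 in range(0, len(x), block):
        j1 = min(j0 + block, len(x))
        x_tile = x[j0:j1]
        for i, row in enumerate(a):
            acc = 0.0
            for offset, x_j in enumerate(x_tile):
                acc += row[j0 + offset] * x_j
            y[i] += acc
    return y
```

Both functions return the same vector up to rounding; only the order in which the entries of the matrix are visited differs.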

Figure 2 is a graph of the average numerical rank of interaction found by the filter for
Legendre nodes (the average of the ranks $\{R_i\}$), plotted as a function of m, for J = 1024 and
$\varepsilon = 10^{-12}$. (The ranks for the filter for arbitrary nodes, when applied to Chebyshev nodes,
were nearly identical.) Also plotted in Fig. 2 is the theoretical upper bound for the average
rank, that is, twice the average rank of an FMM for the 1/x kernel of the same accuracy.
Since most of the ranks were close to their average, the execution time of the FMM is
roughly proportional to the average rank. (See Section 3.2.6 for an analysis of the case of
all ranks being equal; a similar analysis applies to other variants of the 1D FMM.) Thus,
Fig. 2 provides a rough indication of the amount of speedup that is obtained by switching
from the scheme of [10] to the generalized FMM: to a first approximation, if the average
rank were equal to its upper bound for all m, the two schemes would be of equal speed;
to the extent that it is lower, the generalized FMM is faster. (However, it should be noted
that the generalized FMM requires more precomputed data and is, thus, more vulnerable to
caching effects.)

FIG. 2. Average numerical rank of interaction, as a function of m, for J = 1024 and $\varepsilon = 10^{-12}$. The dashed
line is the theoretical bound on the rank.


7.  GENERALIZATIONS 

In this paper, we have presented a scheme for the efficient filtering of functions on the
two-dimensional sphere. The approach is based on two observations. The first observation
is that in the fast multipole method (see, for example, [3, 6]) potential kernels can be
replaced with functions from a much more general class, using the standard singular value
decomposition, and that this yields a fairly efficient implementation. The second observation
is that the Christoffel-Darboux formula (28) provides a straightforward proof that the
filtering operator on the sphere (27) can be compressed by FMM-type techniques. Both
observations admit far-reaching generalizations, outlined below.

1. The fast multipole method used in this paper is a special case of an extremely general
procedure. Particular versions of this procedure have been used repeatedly (see [11, 12]);
it is effective in all situations when the operator can be compressed by wavelet techniques.
The following is a brief outline of the approach.

Given a matrix to be rapidly applied to arbitrary vectors, examine it (either analytically
or numerically), identifying large submatrices that are of low rank. When the coefficients of
a submatrix are a sufficiently smooth function of its indices, such a submatrix is guaranteed
to have a low rank (this is the environment where wavelets and wavelet-type techniques
can be used); another frequently encountered situation involves submatrices that are not
smooth, but are smooth matrices multiplied by diagonal matrices from the left and/or
from the right (as in the case of the filtering operator (30)). Any matrix whose rank is
much lower than its dimensionality is "compressed" by its singular value decomposition;
applying this procedure to a sufficiently large collection of submatrices of some matrix, we
obtain a primitive "fast" algorithm for applying it to arbitrary vectors. The scheme is further
accelerated by recursive application of this approach.
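The saving from compressing a low-rank submatrix can be illustrated with a short sketch. It assumes the factors are already available: in the scheme outlined above they would come from a truncated singular value decomposition, whereas here they are simply supplied by the caller, and all function names are hypothetical.

```python
def apply_dense(block, x):
    """Apply an m-by-n block stored entry by entry: m*n multiplications."""
    return [sum(b_ij * x_j for b_ij, x_j in zip(row, x)) for row in block]


def apply_low_rank(left, right, x):
    """Apply sum_r left[r] * right[r]^T to x: about rank*(m + n) multiplications.

    left[r] has length m and right[r] has length n.
    """
    y = [0.0] * len(left[0])
    for left_r, right_r in zip(left, right):
        coeff = sum(v_j * x_j for v_j, x_j in zip(right_r, x))
        for i, u_i in enumerate(left_r):
            y[i] += coeff * u_i
    return y


def expand(left, right):
    """Form the dense block represented by the factors (for checking only)."""
    m, n = len(left[0]), len(right[0])
    return [[sum(l[i] * r[j] for l, r in zip(left, right)) for j in range(n)]
            for i in range(m)]
```

For a block of rank r the factored form costs roughly r(m + n) multiplications instead of mn, which is the source of the speedup when r is much smaller than m and n.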

A strong argument can be made that the SVD of a matrix is its "optimal" low-rank
representation; in this sense, SVD-based implementations of FMM-type algorithms are
"optimal." Indeed, schemes have been constructed using the SVD to further compress
multipole expansions (see, for example, [3, 9]); the resulting procedures tend to be more
efficient than the original FMM. In addition, the FMM for potential kernels has been
accelerated (dramatically so, in higher dimensions) by using diagonal forms of translation
operators (see [7, 15]). Possible hybrid algorithms combining the latter with SVD-based
compression of more general kernels are currently under investigation in one, two, and three
dimensions.

2. Formula (28) in the present paper is a special case of the well-known Christoffel-
Darboux formula,

$$\sum_{k=0}^{n} p_k(x)\, p_k(y) = \frac{q_n}{q_{n+1}} \cdot \frac{p_{n+1}(x)\, p_n(y) - p_n(x)\, p_{n+1}(y)}{x - y}, \qquad (34)$$

where $p_k$ are polynomials orthogonal with some weight function w on some interval, $q_k$
is the coefficient at the term $x^k$ in the polynomial $p_k$, and n is an arbitrary positive
integer (see, for example, [5, Section 8.902]). It is immediately clear from (34) that the
algorithm of this paper can be used to evaluate rapidly the projections in spaces of
polynomials on subspaces consisting of polynomials of reduced rank, in the norm associated
with the weight w. There are a number of other projections that can be evaluated rapidly
using the FMM scheme of this paper, or its variants. The operators we have experimented
with include projections on subspaces in the space of polynomials in two dimensions,
projections on subspaces spanned by appropriately chosen Bessel functions, and several
others. In some cases, we have determined experimentally that the scheme works, but have
not constructed the underlying mathematics. This whole class of issues is currently under
investigation.
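As a numerical sanity check of the Christoffel-Darboux formula, the sketch below evaluates both sides for Legendre polynomials on [-1, 1] with weight w = 1. It assumes the polynomials are normalized to unit norm, which is the setting in which the formula holds in the form written above; the function names are illustrative only.

```python
import math


def legendre_orthonormal(n_max, x):
    """Return [p_0(x), ..., p_{n_max}(x)], orthonormal on [-1, 1] with w = 1."""
    big_p = [1.0, x]
    for k in range(1, n_max):
        # Three-term recurrence for the classical Legendre polynomials P_k.
        big_p.append(((2 * k + 1) * x * big_p[k] - k * big_p[k - 1]) / (k + 1))
    return [math.sqrt((2 * k + 1) / 2.0) * big_p[k] for k in range(n_max + 1)]


def leading_coefficient(k):
    """Coefficient q_k of x**k in the orthonormal Legendre polynomial p_k."""
    return math.sqrt((2 * k + 1) / 2.0) * math.comb(2 * k, k) / 2.0 ** k


def christoffel_darboux_sides(n, x, y):
    """Return the sum over k <= n and the closed form, for x != y."""
    p_x = legendre_orthonormal(n + 1, x)
    p_y = legendre_orthonormal(n + 1, y)
    lhs = sum(p_x[k] * p_y[k] for k in range(n + 1))
    ratio = leading_coefficient(n) / leading_coefficient(n + 1)
    rhs = ratio * (p_x[n + 1] * p_y[n] - p_x[n] * p_y[n + 1]) / (x - y)
    return lhs, rhs
```

The closed form needs only two polynomial values at each point, which is what makes the projection kernel amenable to FMM-type compression.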
Abstract. In the first part of the paper we present an implementation of Milder's operator
expansion formalism for acoustic scattering from a rough non-periodic surface. Our main
contribution to the forward-field calculation is the development of two accurate ways of computing
the order-zero normal differentiation operator $N_0$. The accuracy of our implementation is tested
numerically. In the second part of our paper we apply this approach, combined with a continuation
method, to an inverse scattering problem. The resulting scheme performs significantly better than
the classical first-order methods.


1.  Introduction 

Scattering  theory  has  been  an  active  area  of  research  for  several  decades.  Several  related 
problems  belong  to  this  field:  acoustic  and  electromagnetic  scattering  form  two  large  classes, 
which  are  further  subdivided  by  assumptions  on  the  underlying  media  and  on  the  boundary 
conditions. 

In  direct  problems  one  wants  to  calculate  the  field  scattered  by  a  given  object.  In  two 
common  situations,  one  knows  either  the  values  of  the  field  on  the  scatterer  (the  Dirichlet 
problem),  or  the  values  of  the  normal  derivative  of  the  field  on  the  boundary  (the  Neumann 
problem).  Direct  problems  are  usually  well  posed. 

Inverse  problems  involve  reconstructing  the  shape  of  a  scatterer  from  the  scattered  field. 
These  problems  are  ill  posed:  the  solution  has  an  unstable  dependence  on  the  input  data. 

For  the  convenience  of  the  reader,  we  shall  outline  the  progress  made  in  acoustic  scattering 
in  a  homogeneous  medium  from  a  sound-soft  obstacle.  A  thorough  discussion  of  this  and 
related  problems  can  be  found  in  the  references  listed  in  the  bibliography.  The  list  of  references 
is  meant  to  be  representative,  rather  than  comprehensive. 

The  sound-soft  scattering  problem  is  characterized  by  the  condition  that  the  total  field 
vanishes  on  the  boundary  of  the  scatterer.  Thus,  acoustic  scattering  is  equivalent  to  the 
Dirichlet  boundary  value  problem  for  the  Helmholtz  operator,  with  the  scattered  field  equal 
to  the  negative  of  the  known  incident  field.  This  problem  is  frequently  solved  by  methods 
of  potential  theory.  The  single-  and  double-layer  potentials  relate  a  charge  density  on  the 
boundary  of  the  scatterer  to  the  limiting  values  of  the  field  and  its  normal  derivative.  The 
resulting  integral  equation  is  then  solved  in  an  appropriate  function  space,  a  common  choice 
being the Lebesgue space $L^2$.

If the boundary is sufficiently smooth ($C^2$, for example) the method of layer potentials falls
within the scope of Fredholm theory (see [3]). When the boundary is merely Lipschitz, the
Dirichlet problem becomes much more difficult and was first studied for the Laplace operator,
corresponding to a zero wavenumber. The boundedness of the double-layer potential as an
operator on $L^2$ is a deep result in real-variable theory, proved in [1] for arbitrary Lipschitz
constants (see also [2] for a survey of related topics). Invertibility of the double-layer potential
in $L^2$ was first proved in [17], and extended to other $L^p$ spaces in [6]. A thorough description
of related research, together with an extensive bibliography, is given in [9]. Extensions to
non-zero wavenumbers and higher dimensions are obtained and described in [7, 11, 14, 15]
([14] has an extensive bibliography).

For the direct problem, a straightforward numerical solution of the integral equations for
the scattered field leads to an $O(n^2)$ algorithm.

For the inverse problem, numerical methods must cope with the problem's inherent
ill-posedness. Some commonly used approaches require that the scattered field can be analytically
continued across the boundary of the scatterer, which makes the problem even more unstable.
References [4, 10] contain detailed descriptions of these methods and discuss the difficulties
associated with them.

In this paper, we consider both the direct and inverse problems of acoustic scattering
in a homogeneous medium. Following Milder [12, 13], we start from the boundary integral
equation formulation and expand the scattering amplitude in a series of readily computable
terms. The principal tool in this formalism is the admittance operator relating the scattered
field and its normal derivative at the scattering surface. See [18] for a thorough discussion of
the operator expansion method and other issues in rough surface scattering.

We adapt Milder's theory to fast numerical evaluation of the field scattered from rough
(Lipschitz) surfaces with compact support. Other authors, see [8], have already reported
numerical implementations of Milder's theory. Our contribution, in the case of forward-
scattering computations, is to implement $N_0$ (the order-zero normal differentiation operator)
accurately, for the case of a compact boundary. We resolve the problems caused by the
singularity of the symbol of $N_0$ as a pseudo-differential operator and that of the associated
integral kernel. We also implement $N_2$. In two dimensions, the results of our implementations
are compared with the exact solution obtained by classical integral-equation methods. We have
validated  our  method  numerically  for  boundaries  with  Lipschitz  constant  less  than  -L.  In  the 
second  part  of  the  paper,  we  approximate  Ns,  the  inversion-symmetric  form  of  the  admittance 
operator,  by  N0  in  the  forward-field  equation  and  invert  the  resulting  expression  to  solve  an 
inverse  scattering  problem  in  the  far-field  regime.  We  use  a  continuation  method  with  respect 
to  the  frequency:  at  each  step  we  apply  Newton’s  method  with  the  starting  point  given  by  the 
output  from  the  previous  step.  Thus  at  each  stage  we  create  an  approximation  to  the  curve 
filtered at a higher frequency. Our method recovers some nonlinear effects not accounted for
by  the  classical  Fourier  inversion  method,  and  works  well  in  some  situations  where  the  linear 
term  approximation  fails  completely. 

The  paper  is  organized  as  follows.  Section  2  introduces  the  notation  used  in  the  paper. 
Section 3 contains a detailed description of Milder's formalism, as well as the algebraic
transformations to ensure that the relevant operators always act on functions of compact
support. Then we describe two implementations of the operator $N_0$ and compare them. The
section  concludes  with  numerical  results  for  the  forward-field  computations.  We  consider  an 
inverse  scattering  problem  in  section  4  and  discuss  our  continuation  method  for  solving  it.  This 
section  also  includes  some  numerical  experiments  in  surface  reconstruction.  We  conclude  with 
a summary in section 5.

2. Notation and definitions

We shall associate with the vector $X = (x_1, x_2, x_3) \in \mathbb{R}^3$ the vector $\bar{X} = (x_1, x_2, -x_3)$. x
without subscripts will denote a vector in $\mathbb{R}^2$ and we shall sometimes write X as $(x, x_3)$. Our
scattering surface is denoted by $\Gamma$ and is given by the graph of a compactly supported Lipschitz
function $\zeta : \mathbb{R}^2 \to \mathbb{R}$. The points on the surface are thus of the form $(x, \zeta(x))$. The free-space
Green's function G(X, Y) for the wavenumber k is given by the formula

$$G(X, Y) = \frac{1}{4\pi}\, \frac{\exp[ik|X - Y|]}{|X - Y|} \qquad (1)$$

for $X \neq Y$.

We shall frequently denote G(X, Y) by $G_X(Y)$. We shall also use the following expression
for G:

$$G((x, z), (x_0, z_0)) = g(|(x, z) - (x_0, z_0)|)$$

where $(x, z) \neq (x_0, z_0)$ and

$$g(r) = \frac{1}{4\pi}\, \frac{e^{ikr}}{r}. \qquad (2)$$

Functions satisfying the Helmholtz equation

$$(\Delta + k^2)\, u = 0 \qquad (3)$$

will be called metaharmonic.
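The definitions of this section can be checked numerically. The sketch below evaluates the free-space Green's function and verifies by finite differences that the radial profile g is metaharmonic away from r = 0, using the radial form of the Laplacian in three dimensions, $g'' + (2/r) g'$. The function names and the step size `dr` are assumptions made for illustration.

```python
import cmath
import math


def g(r, k):
    """Radial profile g(r) = exp(i k r) / (4 pi r)."""
    return cmath.exp(1j * k * r) / (4.0 * math.pi * r)


def green(x_vec, y_vec, k):
    """Free-space Green's function G(X, Y) for two distinct points in R^3."""
    return g(math.dist(x_vec, y_vec), k)


def helmholtz_residual(r, k, dr=1e-3):
    """(Laplacian + k^2) g at radius r, using central differences in r."""
    d1 = (g(r + dr, k) - g(r - dr, k)) / (2.0 * dr)
    d2 = (g(r + dr, k) - 2.0 * g(r, k) + g(r - dr, k)) / dr ** 2
    return d2 + (2.0 / r) * d1 + k * k * g(r, k)
```

The residual is limited only by the accuracy of the difference quotients and shrinks as `dr` is reduced, until rounding error takes over.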


3.  Computation  of  the  scattered  field 

We consider the Dirichlet problem for acoustic scattering from a compactly supported
perturbation of the plane. In subsection 3.1, we describe Milder's operator expansion
formalism. We also discuss a modification we make to ensure that all integrations are
performed over compact regions. The next two subsections (3.2 and 3.3) form the main
part of our contribution to the forward-scattering computations: two implementations of the
order-zero normal differentiation operator $N_0$. Because of the central role $N_0$ plays in the
expansion formalism, we feel it is of interest to describe different ways of implementing it.
In subsection 3.4, we compare the two methods. The last subsection (3.5) presents some
numerical examples of computations of the scattered field.


3.1.  The  operator  expansion  formalism 


The surface $\Gamma$ of the scatterer is given by the graph of a compactly supported Lipschitz function
$\zeta : \mathbb{R}^2 \to \mathbb{R}$. We consider the Dirichlet problem for the Helmholtz equation, i.e. we wish to
solve

$$(\Delta + k^2)\, \Phi_{\mathrm{scat}} = 0 \qquad (4)$$

in the region lying above $\Gamma$, with the sound-soft boundary condition

$$\Phi_{\mathrm{scat}}|_{\Gamma} = -\Phi_{\mathrm{inc}}|_{\Gamma} \qquad (5)$$

where $\Phi_{\mathrm{inc}}$ is the (known) incoming wave and $\Phi_{\mathrm{scat}}$ is the scattered wave.

Following Milder, see [12, 13], we begin with the Green-Helmholtz integral for the
scattered field:

$$\Phi_{\mathrm{scat}}(R) = \int_{\Gamma} \left[ \frac{\partial G_R}{\partial n}(X)\, \Phi_{\mathrm{scat}}(X) - \frac{\partial \Phi_{\mathrm{scat}}}{\partial n}(X)\, G_R(X) \right] ds(X) \qquad (6)$$

where the free-space Green's function is defined by

$$G_R(X) = \frac{\exp[ik|X - R|]}{4\pi |X - R|}. \qquad (7)$$

Milder has modified this formula to obtain

$$\Phi_{\mathrm{scat}}(R) = 2 \int G_R(y, \zeta(y))\, (N_s \Phi_{\mathrm{inc}})(y)\, dy \qquad (8)$$

where $N_s$ has a formal operator power series expansion in $\zeta$. Only even powers of $\zeta$ occur in
the expansion, and $N_s$ can be written as a series of operators

$$N_s = \sum_{j=0}^{\infty} N_{2j} = N_0 + N_2 + \cdots. \qquad (9)$$

Already, the first two terms of this expansion provide an order-four approximation to the
scattered potential, which surpasses the classical ones of Bragg or Kirchhoff (see [12]). The
expressions for the operators $N_0$ and $N_2$ are given by the following formulae:

$$N_0 f = \left( i \sqrt{k^2 - |\eta|^2}\; \hat{f}(\eta) \right)^{\vee} \qquad (10)$$

$$N_2 f = [\zeta, [\zeta, N_0]]\, N_0 f \qquad (11)$$

where

$$[\zeta, N_0]\, g = \zeta (N_0 g) - N_0 (\zeta g), \qquad (12)$$

$\hat{f}$ is the Fourier transform and $\check{f}$ is the inverse Fourier transform of f.

Higher-order terms have simple expressions in terms of higher-order commutators,
although their implementation gradually becomes more difficult.

Alternatively, $N_0$ can be viewed as a convolution operator with kernel K(x, y) given by

$$K(x, y) = -2\, \frac{g'(|x - y|)}{|x - y|} \qquad (13)$$

where

$$g(r) = \frac{1}{4\pi}\, \frac{e^{ikr}}{r}. \qquad (14)$$

Note that the kernel K(x, y) is singular and is not a rapidly decaying function of |x - y|. Any
accurate numerical implementation has to overcome these problems.

In our experiments the incident field originates at a point source located at S, so that

$$\Phi_{\mathrm{inc}}(Y) = G_S(Y). \qquad (15)$$

We calculate the scattered field $\Phi_{\mathrm{scat}}(R)$ using $N_0$ or $N_0 + N_2$ instead of $N_s$. The resulting
approximations are correct through second and fourth order in $\zeta$, respectively. However, one
cannot use formula (8) directly, since the functions $N_0 \Phi_{\mathrm{inc}}$, $(N_0 + N_2)\Phi_{\mathrm{inc}}$ and $G_R(y, \zeta(y))$
are supported on the whole plane. Therefore, we modify formula (8) so that all non-local
operators are applied to compactly supported functions and the final integration is performed
on a compact set. First, since $G_{\bar{S}}(y)$ is metaharmonic above the boundary, (8) applied to $G_{\bar{S}}(y)$
gives:

$$G_{\bar{S}}(R) = -2 \int G_R(y, \zeta(y))\, N_s G_{\bar{S}}(y)\, dy \qquad (16)$$

where $\bar{S}$ is the reflection of S across the XY-plane. Combining (15), (16) with (8), we obtain

$$\Phi_{\mathrm{scat}}(R) = -G_{\bar{S}}(R) + 2 \int G_R(y, \zeta(y))\, N_s (G_S - G_{\bar{S}})(y)\, dy. \qquad (17)$$

Note that the difference $G_S - G_{\bar{S}}$ vanishes outside the support of $\zeta$.

Even though $G_S - G_{\bar{S}}$ is compactly supported, $N_s(G_S - G_{\bar{S}})$, in general, is not. We shall
now describe the additional modifications that are made to (17) after $N_s$ is replaced by $N_0$, to
ensure integration over a compact set. Defining

$$\Phi_{\mathrm{scat}}(R) = -G_{\bar{S}}(R) + 2 \int G_R(y, \zeta(y))\, N_0 (G_S - G_{\bar{S}})(y)\, dy \qquad (18)$$

we have

$$\Phi_{\mathrm{scat}}(R) = -G_{\bar{S}}(R) + 2 \int G_R(y, 0)\, N_0 (G_S - G_{\bar{S}})(y)\, dy
 + 2 \int \left( G_R(y, \zeta(y)) - G_R(y, 0) \right) N_0 (G_S - G_{\bar{S}})(y)\, dy. \qquad (19)$$

Since $N_0$ is a symmetric operator, and

$$N_0 G_R(y) = N_0 G_{\bar{R}}(y) = \frac{\partial G_{\bar{R}}}{\partial y_3}(y, 0) \qquad (20)$$

we immediately obtain

$$\Phi_{\mathrm{scat}}(R) = -G_{\bar{S}}(R) + 2 \int \frac{\partial G_{\bar{R}}}{\partial y_3}(y, 0)\, (G_S - G_{\bar{S}})(y)\, dy
 + 2 \int \left( G_R(y, \zeta(y)) - G_R(y, 0) \right) N_0 (G_S - G_{\bar{S}})(y)\, dy. \qquad (21)$$

Since both $G_R(y, \zeta(y)) - G_R(y, 0)$ and $\partial G_{\bar{R}} / \partial y_3$ are compactly supported, we see that
the evaluation of $\Phi_{\mathrm{scat}}(R)$ can be reduced to evaluation of inner products of the form
$(N_0 f, g) = \int N_0 f(y)\, g(y)\, dy$, where both f and g are compactly supported.

The operator $N_2$ requires several similar decompositions starting from (17). We omit the
details.
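The compact-support argument of this subsection rests on the fact that a point source and its mirror image produce the same field on the flat part of the surface. The following sketch checks this for one sample point; the source position and evaluation points are arbitrary illustrative values, not data from the experiments.

```python
import cmath
import math


def green(x_vec, y_vec, k):
    """Free-space Green's function exp(i k r) / (4 pi r) in R^3."""
    r = math.dist(x_vec, y_vec)
    return cmath.exp(1j * k * r) / (4.0 * math.pi * r)


def reflect(point):
    """Mirror image of a point across the plane x_3 = 0."""
    return (point[0], point[1], -point[2])


def source_minus_image(surface_point, source, k):
    """Value of G_S - G_{S-bar} at a point (y, zeta(y)) of the surface."""
    return green(surface_point, source, k) - green(surface_point, reflect(source), k)
```

Where the surface height is zero the difference vanishes, so only the part of the plane under the perturbation contributes to the inner products.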

3.2. Implementation of the operator $N_0$

As shown in the previous subsection, computation of the approximate scattered field can be
reduced to evaluation of inner products of the form $(N_0 f, g)$, where both f and g are compactly
supported.

A straightforward numerical implementation of $N_0$ would consist of approximating the
Fourier integral by a DFT, multiplying by the symbol of $N_0$, and then applying an approximate
inverse Fourier transform via another DFT. However, the symbol of $N_0$ as a pseudo-differential
operator, $i\sqrt{k^2 - |\eta|^2}$, is not differentiable on the circle $|\eta| = k$. Therefore, this direct approach
would result in a low-order integration scheme and require a very fine uniform discretization
in frequency to give accurate results.
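For orientation, the straightforward DFT approach described above can be sketched in a one-dimensional periodic setting. This is an illustrative analogue, not the implementation discussed in this paper: the period `length`, the naive O(n^2) transform, and the function names are all assumptions.

```python
import cmath
import math


def q_symbol(eta, k):
    """sqrt(k^2 - eta^2), with positive imaginary part when eta^2 > k^2."""
    d = k * k - eta * eta
    return complex(math.sqrt(d), 0.0) if d >= 0.0 else complex(0.0, math.sqrt(-d))


def apply_n0_dft(samples, k, length):
    """Multiply the DFT of equispaced samples by i*q(eta) and transform back."""
    n = len(samples)
    out = [0j] * n
    for m in range(n):
        freq = m if m <= n // 2 else m - n           # signed frequency index
        eta = 2.0 * math.pi * freq / length
        coeff = sum(samples[j] * cmath.exp(-2j * math.pi * m * j / n)
                    for j in range(n)) / n
        mult = 1j * q_symbol(eta, k)
        for j in range(n):
            out[j] += mult * coeff * cmath.exp(2j * math.pi * m * j / n)
    return out
```

A single propagating mode exp(i eta x) with |eta| < k is mapped to i sqrt(k^2 - eta^2) times itself. For a general function the kink of the symbol at |eta| = k limits the accuracy of this equispaced sampling, which is the difficulty the decomposition below addresses.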

In this subsection, we demonstrate one way of resolving this problem. Our approach can
be applied to compute other Fourier integral operators with singular kernels. In our numerical
experiments, we approximate Lipschitz curves and surfaces by smooth functions. Thus the
function f (and g) is smooth in addition to being compactly supported. Therefore, the function
$\hat{f}$ is numerically compactly supported and integrations involving products of $\hat{f}$ are effectively
on compact subsets of the frequency space.

Our method of computing $(N_0 f, g)$ involves expressing $N_0$ as a sum of two operators, $T_1$
and $T_2$, with the following properties:

•  the symbol of $T_1$ is continuously differentiable to a prescribed order, and

•  $T_2$ is a convolution with a smooth function.

We evaluate $T_1$ using the FFT on the frequency side. Since the symbol of $T_1$ is several
times differentiable, it can be sampled relatively coarsely and still yield a good approximation.

The convolution with the smooth kernel of $T_2$ can be implemented efficiently by an FFT,
where this time the FFT is not viewed as a discretization of the continuous Fourier transform,
as it was when evaluating $T_1$, but as an algebraic operation which diagonalizes the discrete
convolution. $(N_0 f, g)$ is then evaluated by integration over the compact support of g.

We shall exhibit the decomposition of $N_0$ in three dimensions, the result being valid in
two dimensions with only minor modifications.

We note (see [13]) that

$$N_0 f(x) = \frac{1}{(2\pi)^2} \int_{\mathbb{R}^2} i q(\eta)\, e^{i x \cdot \eta}\, \hat{f}(\eta)\, d\eta \qquad (22)$$

where $q(\eta) = \sqrt{k^2 - |\eta|^2}$ is chosen to have a positive imaginary part when $|\eta|^2 > k^2$.

We fix a positive integer m and a positive real $x_3$. We decompose $N_0 f$ into two terms:

$$N_0 f(x) = T_1 f(x) + T_2 f(x)
 = \frac{1}{(2\pi)^2} \int_{\mathbb{R}^2} i q(\eta) \left[ 1 - e^{i q(\eta) x_3} \right]^m e^{i x \cdot \eta}\, \hat{f}(\eta)\, d\eta
 + \frac{1}{(2\pi)^2} \int_{\mathbb{R}^2} i q(\eta) \left\{ 1 - \left[ 1 - e^{i q(\eta) x_3} \right]^m \right\} e^{i x \cdot \eta}\, \hat{f}(\eta)\, d\eta. \qquad (23)$$

Let us first look at $T_1$. Its symbol, $\sigma(T_1)$, is given by

$$\sigma(T_1) = i q(\eta) \left[ 1 - e^{i q(\eta) x_3} \right]^m
 = i q(\eta) \left[ -i q(\eta) x_3 + \frac{q^2(\eta) x_3^2}{2} + \cdots \right]^m
 = c_1 q^{m+1}(\eta) + c_2 q^{m+2}(\eta) + \cdots. \qquad (24)$$

If m is odd, then m + 1 is even, and $q^{m+1}(\eta)$ is a polynomial. Now, for j = 1, 2,

$$\frac{\partial}{\partial \eta_j} q(\eta) = \frac{\partial}{\partial \eta_j} \left( k^2 - |\eta|^2 \right)^{1/2} = -\frac{\eta_j}{q(\eta)} \qquad (25)$$

and

$$\frac{\partial}{\partial \eta_j} q^{l}(\eta) = c\, q^{l-2}(\eta)\, \eta_j. \qquad (26)$$

Thus, each derivative in $\eta$ reduces the exponent of q by two. If l = 2j + 1, then $q^{l}(\eta)$ is j
times continuously differentiable. In the above, if m = 2n + 1, m + 2 = 2(n + 1) + 1, then
$\sigma(T_1)$, the symbol of $T_1$, is n + 1 times continuously differentiable.
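The behaviour of the two symbols can be inspected directly. The sketch below evaluates both parts of the decomposition at a given $|\eta|$; the parameter values used to exercise it (k = 2, $x_3$ = 0.5, m = 3) are arbitrary choices for illustration, and the function names are hypothetical.

```python
import cmath
import math


def q_symbol(eta_abs, k):
    """sqrt(k^2 - |eta|^2), with positive imaginary part when |eta| > k."""
    d = k * k - eta_abs * eta_abs
    return complex(math.sqrt(d), 0.0) if d >= 0.0 else complex(0.0, math.sqrt(-d))


def split_symbol(eta_abs, k, x3, m):
    """Return (sigma(T_1), sigma(T_2)) for the decomposition of i*q(eta)."""
    q = q_symbol(eta_abs, k)
    damp = (1.0 - cmath.exp(1j * q * x3)) ** m
    sigma1 = 1j * q * damp            # vanishes like q**(m + 1) near |eta| = k
    sigma2 = 1j * q * (1.0 - damp)    # decays exponentially for |eta| >> k
    return sigma1, sigma2
```

The two parts always add up to the full symbol. Near the circle $|\eta| = k$ the first part is of size $|q|^{m+1}$, which is where the extra smoothness comes from, while far from the circle the second part is exponentially small, so its kernel is smooth.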

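As for the operator $T_2$, we write

$$T_2(f)(x) = \int_{\mathbb{R}^2} K(x - y)\, f(y)\, dy. \qquad (27)$$

One can show that

$$K(x) = \sum_{p=1}^{m} (-1)^{p+1} \binom{m}{p}\, h(k, x, p x_3) \qquad (28)$$

where

$$h(k, x, x_3) = -2\, \frac{\partial^2}{\partial x_3^2} \left( \frac{\exp\left[ ik \sqrt{|x|^2 + x_3^2} \right]}{4\pi \sqrt{|x|^2 + x_3^2}} \right)
 = -2\, \frac{\exp\left[ ik \sqrt{|x|^2 + x_3^2} \right]}{4\pi \sqrt{|x|^2 + x_3^2}} \Big[ ik (|x|^2 + x_3^2)^{-1/2} - (k^2 x_3^2 + 1)(|x|^2 + x_3^2)^{-1}
 - 3ik x_3^2 (|x|^2 + x_3^2)^{-3/2} + 3 x_3^2 (|x|^2 + x_3^2)^{-2} \Big]. \qquad (29)$$

Moreover, $h(k, x, x_3)$ is a smooth function of x for a positive $x_3$ and thus K(x) is also smooth.

Details of the derivation are given in the appendix.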
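The closed form of h can be cross-checked against a finite-difference second derivative of the Green's function in the vertical variable. The sketch below does this for one sample point and also assembles the kernel of $T_2$ as an alternating binomial sum of shifted copies of h; the step `dz`, the sample values, and the function names are assumptions made for the illustration.

```python
import cmath
import math


def h_closed(k, x_abs, x3):
    """Closed-form h(k, x, x_3), with x_abs = |x| and rho^2 = |x|^2 + x_3^2."""
    rho = math.hypot(x_abs, x3)
    bracket = (1j * k / rho
               - (k * k * x3 * x3 + 1.0) / rho ** 2
               - 3j * k * x3 * x3 / rho ** 3
               + 3.0 * x3 * x3 / rho ** 4)
    return -2.0 * cmath.exp(1j * k * rho) / (4.0 * math.pi * rho) * bracket


def h_finite_difference(k, x_abs, x3, dz=1e-3):
    """-2 times a central second difference in x_3 of exp(ik rho)/(4 pi rho)."""
    def green(z):
        rho = math.hypot(x_abs, z)
        return cmath.exp(1j * k * rho) / (4.0 * math.pi * rho)
    return -2.0 * (green(x3 + dz) - 2.0 * green(x3) + green(x3 - dz)) / dz ** 2


def kernel_t2(k, x_abs, x3, m):
    """Kernel of T_2 as an alternating binomial sum of h at heights p * x_3."""
    return sum((-1) ** (p + 1) * math.comb(m, p) * h_closed(k, x_abs, p * x3)
               for p in range(1, m + 1))
```

Because every term is evaluated at a strictly positive height p x_3, the kernel has no singularity at x = 0, in line with the smoothness statement above.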


3.3. An alternative implementation of the operator $N_0$

There is an alternative way of implementing the operator $N_0$. We can regard $N_0$ as a convolution
with an integral kernel, which has a singularity at zero. This section sketches the details of this
approach. The interested reader may see [16] for a thorough discussion of the relevant issues.
In the following we derive an explicit expression for the kernel.

The Green's function for the upper half-space $G_{[z>0]}$ can be expressed in terms of the
free-space Green's function G as follows,

$$G_{[z>0]}((x, z), (x_0, z_0)) = G((x, z), (x_0, z_0)) - G((x, -z), (x_0, z_0)). \qquad (30)$$

The Poisson kernel p for the upper half-space is the outward normal derivative of the Green's
function

$$p(x, (x_0, z_0)) = -\frac{\partial}{\partial z} G_{[z>0]}((x, z), (x_0, z_0)) \Big|_{z=0}
 = 2 g'(|(x, 0) - (x_0, z_0)|)\, \frac{z_0}{|(x, 0) - (x_0, z_0)|}. \qquad (31)$$

The Dirichlet-to-Neumann operator $N_0$ can be expressed by the formula

$$N_0 f(x) = \lim_{z \to 0} \left( -\frac{\partial}{\partial z} \right) \int_{\mathbb{R}^2} p(y, (x, z))\, f(y)\, dy. \qquad (32)$$

The kernel K(x, y) of the Dirichlet-to-Neumann operator $N_0$, for $x \neq y$, is therefore the
outward normal derivative of the Poisson kernel p (see also [18]),

$$K(x, y) = -\frac{\partial}{\partial z} p(y, (x, z)) \Big|_{z=0} = -2\, \frac{g'(|x - y|)}{|x - y|}. \qquad (33)$$

The operator $N_0$ has been implemented via the following approximation

$$N_0 f(x) \approx \text{Trapezoidal sum for } \int K(x, y) f(y)\, dy
 + c_1 f(x) h^{-1} + c_2 \Delta f(x)\, h + c_3 f(x)\, k^2 h + O(h^3) \qquad (34)$$

where $\Delta$ is the Laplace operator in $\mathbb{R}^2$ and h is the side-length of an elementary grid square.
The constants $c_1$, $c_2$, $c_3$ can be computed numerically from the formula (34) using Richardson
extrapolation, see [5], p 269.
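The idea behind Richardson extrapolation can be shown on a much simpler quadrature than (34). The sketch below is a generic illustration only: it uses the ordinary trapezoidal rule for a smooth integrand, whose error behaves like $c h^2$, and recovers both an improved value and an estimate of the coefficient c from two step sizes. It is not the computation used to obtain $c_1$, $c_2$, $c_3$, and all names are hypothetical.

```python
import math


def trapezoid(f, a, b, n):
    """Composite trapezoidal rule with n equal subintervals."""
    h = (b - a) / n
    return h * (0.5 * f(a) + sum(f(a + i * h) for i in range(1, n)) + 0.5 * f(b))


def richardson(coarse, fine, order):
    """Combine values at steps h and h/2 whose error is about c * h**order."""
    factor = 2.0 ** order
    return (factor * fine - coarse) / (factor - 1.0)


def leading_error_coefficient(coarse, fine, h, order):
    """Estimate c in value(h) = limit + c * h**order from steps h and h/2."""
    return (coarse - fine) / (h ** order * (1.0 - 0.5 ** order))
```

For the integral of exp over [0, 1] the estimated coefficient approaches (e - 1)/12, the value predicted by the Euler-Maclaurin formula; the same differencing across grid sizes is what allows correction constants to be determined numerically.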

A similar approach applies to the two-dimensional case. The free-space Green's function
is then given by the formula

$$g(r) = \frac{i}{4} H_0^{(1)}(kr) \qquad (35)$$

and the kernel of $N_0$ is equal to

$$K(x, y) = -2\, \frac{g'(|x - y|)}{|x - y|} = \frac{ik}{2}\, \frac{H_1^{(1)}(k|x - y|)}{|x - y|}. \qquad (36)$$

We use the following approximation:

$$N_0 f(x) \approx \text{Trapezoidal sum for } \int K(x, y) f(y)\, dy + a_1(h) f(x) + a_2(h) f''(x) \qquad (37)$$

where


..  n  1 

3  h  2  7t 


E - +  loe 

i 


02(h)  =  f  +  flLh:'k: 

2n  (2;r)? 


Ilk 

4rr 


and  E  =  0.577  215  ...  is  the  Euler  constant. 


(38) 


3.4. Comparison of the two methods

We have described two different methods of implementing $N_0$. The first one, expressing $N_0$
as a sum of $T_1$ and $T_2$, seems to be rather general and may prove useful for other integral
operators. The main idea is that a non-decaying, singular symbol is broken into two parts: the
first is non-decaying but smooth, while the second is singular but rapidly decaying at infinity.
The first part can be applied on the frequency side with a relatively coarse discretization to
functions with a fast decaying Fourier transform. Thus we can accurately evaluate $T_1 f$ when
f is smooth. The second symbol is not applied on the frequency side, but as a convolution
operator on the space side. Since this symbol is rapidly decaying, the convolution kernel is
smooth and, again, a relatively coarse discretization can be used. Thus we can accurately
evaluate $T_2 f$ when f is compactly supported.

The second method of implementing $N_0$ illustrates how to calculate a convolution with
a kernel having a singularity at 0 numerically. The method is more direct, but the correction
coefficients have to be computed for each particular kernel.


3.5.  Numerical  results 

In this subsection we present examples of numerical computations of approximate scattered
fields. We report our results in two dimensions and compare them with the accurate values
obtained using the classical integral-equation approach. We used the two-dimensional version
of formula (18) to calculate $\Phi_{\mathrm{scat}}(R)$, and a similar expression when $N_s$ is replaced by $N_0 + N_2$.
The results have been obtained with $N_0$ implemented by the method described in section 3.2,
after verifying that both methods give nearly identical results in test cases.

The integral-equation method requires, however, that the scatterer be bounded. When the
scatterer is defined by a non-negative, compactly supported function $\zeta$, it is possible to reduce
the Dirichlet problem on the open domain above $\zeta$ to the Dirichlet problem for the exterior of
a bounded region. To this end, we first construct a solution u to the Dirichlet problem for the
upper half-space. The boundary values of u should match the given data away from the support
of the curve and can be chosen arbitrarily on the support. Next we consider the lens-shaped
region formed by reflecting $\zeta$ about the plane z = 0, and the antisymmetric Dirichlet boundary
conditions given as follows: the boundary values on the upper half of the region are equal to the
original ones minus the values of u on the curve, while the boundary values on the lower half
are the negatives of the corresponding values on the upper half. We now solve the Dirichlet
problem for the resulting symmetric domain with antisymmetric boundary values. Note that
the solution vanishes everywhere on the plane z = 0 outside the bounded region. The sum of
u and the solution for the symmetric region is the solution to the original problem.

Tables 1-3 present results of numerical simulations for a simple test curve. In all cases,
the relative errors are computed for the reduced potential $\Phi = \Phi_{\mathrm{scat}} + G_{\bar{S}}(R)$. Using the full
potential, the relative errors are much smaller, but less meaningful. The errors are computed
in the $l^2$ norm:

$$\left( \frac{\sum_i |\Phi_i - \tilde{\Phi}_i|^2}{\sum_i |\tilde{\Phi}_i|^2} \right)^{1/2} \qquad (39)$$

where $\Phi_i$ is the reduced potential at the ith receiver obtained by the algorithm and $\tilde{\Phi}_i$ is the
corresponding value obtained by solving the combined field integral equations directly (see [4],
p 67, for a thorough description).

Table 1. Relative error of the reduced potential with $N_s \approx N_0$.

                                          Height
Wavenumber    1              0.5            0.25           0.125          0.0625
π             6.72 × 10^-1   1.74 × 10^-1   4.77 × 10^-2   1.27 × 10^-2   3.27 × 10^-3
2π            8.10 × 10^-1   3.24 × 10^-1   8.56 × 10^-2   2.20 × 10^-2   5.60 × 10^-3
4π            9.52 × 10^-1   3.92 × 10^-1   7.74 × 10^-2   1.85 × 10^-2   4.66 × 10^-3
8π            1.13 × 10^0    5.19 × 10^-1   9.43 × 10^-2   2.16 × 10^-2   5.05 × 10^-3
16π           1.24 × 10^0    4.82 × 10^-1   8.64 × 10^-2   2.21 × 10^-2   5.37 × 10^-3
32π           1.30 × 10^0    5.68 × 10^-1   8.34 × 10^-2   2.06 × 10^-2   5.49 × 10^-3

Table 2. Relative error of the reduced potential with $N_s \approx N_0 + N_2$.

                                          Height
Wavenumber    1              0.5            0.25           0.125          0.0625
π             2.82 × 10^-1   2.21 × 10^-2   1.84 × 10^-3   1.34 × 10^-4   2.44 × 10^-5
2π            3.81 × 10^-1   2.10 × 10^-2   1.76 × 10^-3   1.25 × 10^-4   3.28 × 10^-5
4π            1.06 × 10^0    9.09 × 10^-2   ?.?? × 10^-3   3.72 × 10^-4   5.32 × 10^-5
8π            7.81 × 10^-1   2.21 × 10^-1   9.81 × 10^-3   4.18 × 10^-4   7.59 × 10^-5
16π           1.04 × 10^0    3.64 × 10^-1   9.18 × 10^-3   4.47 × 10^-4   2.15 × 10^-4
32π           1.12 × 10^0    5.22 × 10^-1   7.98 × 10^-3   5.09 × 10^-4   6.76 × 10^-4

Table 3. Relative difference of the reduced potentials with $N_s \approx N_0$ and $N_s \approx N_0 + N_2$.

                                          Height
Wavenumber    1              0.5            0.25           0.125          0.0625
π             8.59 × 10^-1   1.95 × 10^-1   4.94 × 10^-2   1.28 × 10^-2   3.28 × 10^-3
2π            8.68 × 10^-1   3.38 × 10^-1   8.69 × 10^-2   2.21 × 10^-2   5.62 × 10^-3
4π            9.86 × 10^-1   4.52 × 10^-1   8.21 × 10^-2   1.88 × 10^-2   4.68 × 10^-3
8π            1.03 × 10^0    5.80 × 10^-1   1.03 × 10^-1   2.20 × 10^-2   5.07 × 10^-3
16π           9.81 × 10^-1   6.54 × 10^-1   9.42 × 10^-2   2.25 × 10^-2   5.39 × 10^-3
32π           1.02 × 10^0    7.70 × 10^-1   9.04 × 10^-2   2.09 × 10^-2   5.48 × 10^-3

Note  how  the  relative  errors  increase  with  the  height  of  the  curve,  but  that  they  remain 
nearly  constant  at  a  fixed  height  as  the  wavenumber  increases. 

Table  4  records  the  result  of  a  scattering  experiment  performed  for  a  curve  having  only 
low-frequency  components.  The  objective  was  to  determine  the  dependence  of  the  term  Ni 
on  the  wavenumber  of  the  incident  field.  We  find  that  the  error  depends  only  weakly  on  the 
wavenumber  of  the  incident  field  once  it  exceeds  the  highest  frequency  of  the  curve. 
in the $\ell_2$ norm:

\[
\frac{\left(\sum_i |\phi_i - \Phi_i|^2\right)^{1/2}}{\left(\sum_i |\Phi_i|^2\right)^{1/2}} \tag{39}
\]

where $\phi_i$ is the reduced potential at the $i$th receiver obtained by the algorithm and $\Phi_i$ is the
corresponding value obtained by solving the combined field integral equations directly (see [4],
p 67, for a thorough description).

Table 1. Relative error of the reduced potential with $N_S \approx N_0$.

                                     Height
Wavenumber   1             0.5           0.25          0.125         0.0625
π            6.72 × 10^-1  1.74 × 10^-1  4.77 × 10^-2  1.27 × 10^-2  3.27 × 10^-3
2π           8.10 × 10^-1  3.24 × 10^-1  8.56 × 10^-2  2.20 × 10^-2  5.60 × 10^-3
4π           9.52 × 10^-1  3.92 × 10^-1  7.74 × 10^-2  1.85 × 10^-2  4.66 × 10^-3
8π           1.13 × 10^0   5.19 × 10^-1  9.43 × 10^-2  2.16 × 10^-2  5.05 × 10^-3
16π          1.24 × 10^0   4.82 × 10^-1  8.64 × 10^-2  2.21 × 10^-2  5.37 × 10^-3
32π          1.30 × 10^0   5.68 × 10^-1  8.34 × 10^-2  2.06 × 10^-2  5.49 × 10^-3

Table 2. Relative error of the reduced potential with $N_S \approx N_0 + N_2$.

                                     Height
Wavenumber   1             0.5           0.25          0.125         0.0625
π            2.82 × 10^-1  2.21 × 10^-2  1.84 × 10^-3  1.34 × 10^-4  2.44 × 10^-5
2π           3.81 × 10^-1  2.10 × 10^-2  1.76 × 10^-3  1.25 × 10^-4  3.28 × 10^-5
4π           1.06 × 10^0   9.09 × 10^-2  5.67 × 10^-3  3.72 × 10^-4  5.32 × 10^-5
8π           7.81 × 10^-1  2.21 × 10^-1  9.81 × 10^-3  4.18 × 10^-4  7.59 × 10^-5
16π          1.04 × 10^0   3.64 × 10^-1  9.18 × 10^-3  4.47 × 10^-4  2.15 × 10^-4
32π          1.12 × 10^0   5.22 × 10^-1  7.98 × 10^-3  5.09 × 10^-4  6.76 × 10^-4

Table 3. Relative difference of the reduced potentials with $N_S \approx N_0$ and $N_S \approx N_0 + N_2$.

                                     Height
Wavenumber   1             0.5           0.25          0.125         0.0625
π            8.59 × 10^-1  1.95 × 10^-1  4.94 × 10^-2  1.28 × 10^-2  3.28 × 10^-3
2π           8.68 × 10^-1  3.38 × 10^-1  8.69 × 10^-2  2.21 × 10^-2  5.62 × 10^-3
4π           9.86 × 10^-1  4.52 × 10^-1  8.21 × 10^-2  1.88 × 10^-2  4.68 × 10^-3
8π           1.03 × 10^0   5.80 × 10^-1  1.03 × 10^-1  2.20 × 10^-2  5.07 × 10^-3
16π          9.81 × 10^-1  6.54 × 10^-1  9.42 × 10^-2  2.25 × 10^-2  5.39 × 10^-3
32π          1.02 × 10^0   7.70 × 10^-1  9.04 × 10^-2  2.09 × 10^-2  5.48 × 10^-3

Note how the relative errors increase with the height of the curve, but that they remain
nearly constant at a fixed height as the wavenumber increases.

Table 4 records the result of a scattering experiment performed for a curve having only
low-frequency components. The objective was to determine the dependence of the term $N_2$
on the wavenumber of the incident field. We find that the error depends only weakly on the
wavenumber of the incident field once it exceeds the highest frequency of the curve.
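
The error measure (39) is simple to compute once the receiver values are available. The following is a minimal sketch, assuming the computed and reference reduced potentials are supplied as equal-length sequences of complex numbers; the function name `relative_l2_error` is illustrative and not taken from the paper.

```python
import math


def relative_l2_error(computed, reference):
    """Relative l2 error of `computed` against `reference`, as in (39).

    Both arguments are sequences of (possibly complex) receiver values.
    """
    if len(computed) != len(reference):
        raise ValueError("computed and reference must have the same length")
    # Numerator: l2 norm of the difference between the two sets of values.
    num = math.sqrt(sum(abs(c - r) ** 2 for c, r in zip(computed, reference)))
    # Denominator: l2 norm of the reference values.
    den = math.sqrt(sum(abs(r) ** 2 for r in reference))
    return num / den
```

With this definition an all-zero approximation has relative error 1, which is why entries of order 10^0 in the tables indicate that the corresponding approximation carries essentially no information.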
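From (43) we find that

\[
G_S(y, f(y)) - G_S(y, -f(y)) = -i\,\frac{e^{ikr}}{2\pi r}\,\exp[-ik(\sigma_1, \sigma_2)\cdot y]\,\sin(k\sigma_3 f(y)) + O\!\left(\frac{1}{r^2}\right). \tag{44}
\]

Similarly,

\[
G_R(y, f(y)) - G_R(y, 0) = \frac{e^{ikr}}{4\pi r}\,\exp[-ik(\omega_1, \omega_2)\cdot y]\left(e^{-ik\omega_3 f(y)} - 1\right) + O\!\left(\frac{1}{r^2}\right). \tag{45}
\]

Moreover,

\[
G_R(y, y_3) = \frac{e^{ikr}}{4\pi r}\,\exp[-ik\omega\cdot(y, y_3)] + O\!\left(\frac{1}{r^2}\right), \tag{46}
\]

and therefore

\[
\frac{\partial G_R}{\partial y_3}(y, 0) = -ik\omega_3\,\frac{e^{ikr}}{4\pi r}\,\exp[-ik(\omega_1, \omega_2)\cdot y] + O\!\left(\frac{1}{r^2}\right). \tag{47}
\]

Combining (44), (45), (47) with (40), we obtain

\[
\begin{aligned}
\phi_{\mathrm{scat}}(R) \approx{} & -G_{-S}(R) + k\omega_3\,\frac{e^{2ikr}}{4\pi^2 r^2}\int_{\mathbb{R}^2} \exp[-ik(\omega_1 + \sigma_1, \omega_2 + \sigma_2)\cdot y]\,\sin(k\sigma_3 f)\,dy \\
& - i\,\frac{e^{2ikr}}{4\pi^2 r^2}\int_{\mathbb{R}^2} \exp[-ik(\omega_1, \omega_2)\cdot y]\left(e^{-ik\omega_3 f} - 1\right) N_0\!\left(\exp[-ik(\sigma_1, \sigma_2)\cdot y]\,\sin(k\sigma_3 f)\right) dy + O\!\left(\frac{1}{r^3}\right).
\end{aligned} \tag{48}
\]

This leads to an expression in terms of the Fourier coefficients

\[
\begin{aligned}
\phi_{\mathrm{scat}}(R) \approx{} & -G_{-S}(R) + k\omega_3\,\frac{e^{2ikr}}{4\pi^2 r^2}\,[\sin(k\sigma_3 f)]^{\wedge}(k\omega_1 + k\sigma_1, k\omega_2 + k\sigma_2) \\
& - i\,\frac{e^{2ikr}}{4\pi^2 r^2}\left[\left(e^{-ik\omega_3 f} - 1\right) N_0\!\left(\exp[-ik(\sigma_1, \sigma_2)\cdot y]\,\sin(k\sigma_3 f)\right)\right]^{\wedge}(k\omega_1, k\omega_2).
\end{aligned} \tag{49}
\]

In the special case, when the source is directly above, this formula becomes

\[
\begin{aligned}
\phi_{\mathrm{scat}}(R) \approx{} & -G_{-S}(R) + k\omega_3\,\frac{e^{2ikr}}{4\pi^2 r^2}\,[\sin(kf)]^{\wedge}(k\omega_1, k\omega_2) \\
& - i\,\frac{e^{2ikr}}{4\pi^2 r^2}\left[\left(e^{-ik\omega_3 f} - 1\right) N_0(\sin(kf))\right]^{\wedge}(k\omega_1, k\omega_2) + O\!\left(\frac{1}{r^3}\right).
\end{aligned} \tag{50}
\]

Similarly, for the two-dimensional case, one can derive the following formula:

\[
\begin{aligned}
\phi_{\mathrm{scat}}(R) \approx{} & -G_{-S}(R) + i\omega_3\,\frac{e^{2ikr}}{2\pi r}\,[\sin(kf)]^{\wedge}(k\omega_1) \\
& + \frac{e^{2ikr}}{2\pi k r}\left[\left(e^{-ik\omega_3 f} - 1\right) N_0(\sin(kf))\right]^{\wedge}(k\omega_1) + O\!\left(\frac{1}{r^2}\right).
\end{aligned} \tag{51}
\]

Although we used expression (51) in our numerical experiments, we would like to mention
the following formula because of its appealing simplicity. For small elevations $kf$, the sines
and the exponentials can be expanded in powers of their arguments, yielding

\[
\phi_{\mathrm{scat}}(R) \approx -G_{-S}(R) + ik\omega_3\,\frac{e^{2ikr}}{2\pi r}\,\hat{f}(k\omega_1) + O\!\left(\frac{1}{r^2}\right). \tag{52}
\]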
A similar result holds in three dimensions.

Let us now describe the geometric setup in two dimensions. The function f is supported
on the interval [-1, 1]. The receivers at which we measure the scattered field are located on a
semicircle of radius 10^5 in such a way that their projections on the x-axis are equispaced. The
number of receivers is ⌊2k/π⌋. The source is located at the point (0, 10^5).

Our reconstruction of f proceeds as follows.

•  Step 0. We set the initial approximation to zero.

•  Step 1. We choose an initial value for the wavenumber k and seek an approximation to
the function f by a trigonometric polynomial of degree not exceeding k. Substituting

\[
f = \sum_{n=-k}^{k} c_n e^{inx} \tag{53}
\]

in (51), we solve for the coefficients $c_n$ using Newton's method with the previous
approximation as the starting point. The resulting solution represents the Fourier
coefficients of f corresponding to the frequencies not exceeding k.

•  Step 2. We increase k to a new value k' (k' = 2k is a convenient choice). We repeat
step 1 with the previous approximation to f as our starting point. More precisely, we
approximate f by the Fourier series $\sum_{n=-k'}^{k'} c'_n e^{inx}$ and determine the coefficients $c'_n$ by
solving (51) using Newton's method starting from the previous result:

\[
c'_n = \begin{cases} c_n & \text{for } |n| \le k, \\ 0 & \text{for } |n| > k, \end{cases} \tag{54}
\]

where the coefficients $c_n$ come from step 1.

We now iterate step 1 and step 2 until we reach a prescribed frequency $k_0$. For a complete
reconstruction we need to choose $k_0$ larger than the highest frequency of the curve.

We have observed experimentally that the continuation method described above converges
for a larger class of surfaces than Newton's method starting at f = 0.
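
The steps above can be sketched as a short driver loop. This is only an illustration of the continuation structure, under several assumptions: the degree k is treated as an integer, the coefficients $c_n$ are stored in a list ordered from n = -k to n = k, and `solve_at` is a hypothetical user-supplied routine standing in for the Newton solve of (51) at a given k. None of these names come from the paper.

```python
def pad_coefficients(coeffs, k_old, k_new):
    """Extend coefficients c_n, n = -k_old..k_old, to n = -k_new..k_new.

    New entries with |n| > k_old are set to zero, as in the starting
    guess used in step 2.
    """
    if len(coeffs) != 2 * k_old + 1:
        raise ValueError("expected 2*k_old + 1 coefficients")
    pad = k_new - k_old
    return [0j] * pad + list(coeffs) + [0j] * pad


def continuation(solve_at, k_start, k_final):
    """Frequency continuation: solve at k_start, then double k up to k_final.

    `solve_at(k, initial_coeffs)` is a hypothetical solver that returns
    improved coefficients of degree k starting from `initial_coeffs`.
    """
    coeffs = [0j] * (2 * k_start + 1)  # step 0: zero initial approximation
    k = k_start
    while True:
        coeffs = solve_at(k, coeffs)  # step 1: solve at the current k
        if k >= k_final:
            return coeffs
        k_next = min(2 * k, k_final)  # step 2: increase k
        coeffs = pad_coefficients(coeffs, k, k_next)
        k = k_next
```

The point of the design is that each solve starts from the solution at the previous, lower frequency, which is the property the text credits for the larger region of convergence compared with a single Newton solve started at f = 0.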


Figure 1. Reconstructions of the curve filtered at k = π: filtered curve, second-order
reconstruction, and first-order reconstruction.

4.2. Numerical results

Figures 1-6 illustrate the continuation method as described in the previous subsection. The
solid curve in the final figure is the unknown curve to be reconstructed. The first figure
shows a filtered version of that curve at wavenumber π, and the reconstruction carried out
using Newton's method starting from the zero curve. The second-order reconstruction is
plotted together with the 'classical' linear reconstruction. The output of the second-order
reconstruction is then the starting point for the next stage, where the wavenumber doubles (and
so does the number of receivers on the semicircle). We proceed successively, as outlined in
section 4.1, until we reach the wavenumber that is above the highest frequency of the curve. At
each stage we attempt to reconstruct the true curve filtered at the corresponding wavenumber.
The final reconstruction using the second-order method with continuation approximates the
curve very well. The first-order reconstruction is good for the first two stages but then moves
further and further away from the actual curve.

Figure 2. Reconstructions of the curve filtered at k = 2π: filtered curve, second-order
reconstruction, and first-order reconstruction.

Figure 3. Reconstructions of the curve filtered at k = 4π: filtered curve, second-order
reconstruction, and first-order reconstruction.

Figure 4. Reconstructions of the curve filtered at k = 8π: filtered curve, second-order
reconstruction, and first-order reconstruction.

Figure 5. Reconstructions of the curve filtered at k = 16π: filtered curve, second-order
reconstruction, and first-order reconstruction.
5.  Conclusions  and  summary 

We present an implementation of Milder's operator expansion algorithm for acoustic scattering
with Dirichlet boundary condition. We modify the integral used by Milder to ensure that
all integral operators are applied to compactly supported functions and integrations are
performed on bounded sets. Our main contribution to the forward-field calculation has been the
development of two accurate ways of implementing the $N_0$ operator. We have also combined
Milder's formalism together with a continuation method in frequency to reconstruct accurately
rough boundaries with rather large heights. We have presented examples for which our
method using second-order terms works, but for which the first-order reconstruction fails.
Our numerical results suggest that the higher-order approximation errors from incident fields
having higher wavenumber than the frequency content of the boundary tend to remain nearly
constant as the wavenumber of the incident field increases.

A scheme for the fast evaluation of the Helmholtz potentials can be added to accelerate
the algorithm. Such methods are currently being developed by several authors.

Figure 6. Reconstructions of the original curve with k = 32π: original curve, second-order
reconstruction, and first-order reconstruction.
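Appendix

In this appendix, we provide a detailed derivation of the kernel of the convolution operator $T_2$
defined in section 3.2.

From (23) we obtain

\[
\begin{aligned}
T_2(f)(x) &= \frac{1}{(2\pi)^2}\int_{\mathbb{R}^2}\int_{\mathbb{R}^2} iq(\eta)\left\{1 - \left[1 - e^{iq(\eta)x_3}\right]^m\right\} e^{-iy\cdot\eta}\,e^{ix\cdot\eta} f(y)\,dy\,d\eta \\
&= \int_{\mathbb{R}^2} dy\,f(y)\,\frac{1}{(2\pi)^2}\int_{\mathbb{R}^2} iq(\eta)\left\{1 - \left[1 - e^{iq(\eta)x_3}\right]^m\right\} e^{i(x-y)\cdot\eta}\,d\eta \\
&= \int_{\mathbb{R}^2} K(x - y) f(y)\,dy
\end{aligned} \tag{55}
\]

where

\[
\begin{aligned}
K(x) &= \frac{1}{(2\pi)^2}\int_{\mathbb{R}^2} iq(\eta)\left\{1 - \left[1 - e^{iq(\eta)x_3}\right]^m\right\} e^{ix\cdot\eta}\,d\eta \\
&= \sum_{n=1}^{m} (-1)^{n+1}\binom{m}{n}\,h(k, x, nx_3)
\end{aligned} \tag{56}
\]

with

\[
h(k, x, x_3) = \frac{1}{(2\pi)^2}\int_{\mathbb{R}^2} iq(\eta)\,e^{ix\cdot\eta}\,e^{iq(\eta)x_3}\,d\eta. \tag{57}
\]

We note that $h(k, x, x_3)$ can also be expressed as

\[
h(k, x, x_3) = -i\,\frac{\partial^2}{\partial x_3^2}\,\frac{1}{(2\pi)^2}\int_{\mathbb{R}^2} e^{ix\cdot\eta}\,e^{iq(\eta)x_3}\,\frac{d\eta}{q(\eta)}. \tag{58}
\]

We shall use the spectral form of the free-space Green's function, see [13],

\[
\frac{\exp[ik\|X - Y\|]}{4\pi\|X - Y\|} = \frac{i}{2}\,\frac{1}{(2\pi)^2}\int_{\mathbb{R}^2} \exp[i(x - y)\cdot\eta + iq(\eta)|x_3 - y_3|]\,\frac{d\eta}{q(\eta)}. \tag{59}
\]

Again, since $x_3$ is positive, setting Y = 0, we obtain

\[
\frac{\exp\!\left[ik\sqrt{x^2 + x_3^2}\right]}{4\pi\sqrt{x^2 + x_3^2}} = \frac{i}{2}\,\frac{1}{(2\pi)^2}\int_{\mathbb{R}^2} e^{ix\cdot\eta}\,e^{iq(\eta)x_3}\,\frac{d\eta}{q(\eta)}, \tag{60}
\]

where $x^2 = x_1^2 + x_2^2$. Substitution of (60) into (58) gives

\[
h(k, x, x_3) = -2\,\frac{\partial^2}{\partial x_3^2}\left(\frac{\exp\!\left[ik\sqrt{x^2 + x_3^2}\right]}{4\pi\sqrt{x^2 + x_3^2}}\right). \tag{61}
\]

After a straightforward calculation, we obtain:

\[
\begin{aligned}
h(k, x, x_3) = -2\,\frac{\exp\!\left[ik\sqrt{x^2 + x_3^2}\right]}{4\pi\sqrt{x^2 + x_3^2}}\Big\{ & ik(x^2 + x_3^2)^{-1/2} - (k^2x_3^2 + 1)(x^2 + x_3^2)^{-1} \\
& - 3ikx_3^2(x^2 + x_3^2)^{-3/2} + 3x_3^2(x^2 + x_3^2)^{-2} \Big\}.
\end{aligned} \tag{62}
\]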
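
The "straightforward calculation" leading to (62) can be checked numerically by comparing the closed form with a finite-difference approximation of -2 times the second derivative of the Green's function with respect to $x_3$, as in (61). The sketch below is only a sanity check under stated assumptions: `rho2d` stands for the planar distance $\sqrt{x_1^2 + x_2^2}$, and the function names and the step size are choices made here for illustration.

```python
import cmath
import math


def green(k, rho2d, x3):
    """Free-space Green's function exp(ik*rho) / (4*pi*rho), rho = sqrt(rho2d^2 + x3^2)."""
    rho = math.hypot(rho2d, x3)
    return cmath.exp(1j * k * rho) / (4 * math.pi * rho)


def h_closed_form(k, rho2d, x3):
    """Closed-form expression (62) for h(k, x, x3)."""
    s = rho2d ** 2 + x3 ** 2  # s = x^2 + x3^2
    bracket = (
        1j * k * s ** -0.5
        - (k * k * x3 * x3 + 1) / s
        - 3j * k * x3 * x3 * s ** -1.5
        + 3 * x3 * x3 / s ** 2
    )
    return -2 * green(k, rho2d, x3) * bracket


def h_finite_difference(k, rho2d, x3, step=1e-4):
    """Approximate -2 * d^2 G / d x3^2 by a central second difference."""
    second = (
        green(k, rho2d, x3 + step)
        - 2 * green(k, rho2d, x3)
        + green(k, rho2d, x3 - step)
    ) / step ** 2
    return -2 * second
```

For moderate k and points away from the origin the two values agree to several digits, which is a quick guard against sign or exponent slips when implementing the kernel K through (56).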
RAPID EVALUATION OF NONREFLECTING BOUNDARY
KERNELS FOR TIME-DOMAIN WAVE PROPAGATION

BRADLEY ALPERT, LESLIE GREENGARD, AND THOMAS HAGSTROM

Abstract. We present a systematic approach to the computation of exact nonreflecting boundary conditions for the wave equation. In both two and three dimensions, the critical step in our analysis involves convolution with the inverse Laplace transform of the logarithmic derivative of a Hankel function. The main technical result in this paper is that the logarithmic derivative of the Hankel function $H_\nu^{(1)}(z)$ of real order $\nu$ can be approximated in the upper half z-plane with relative error $\varepsilon$ by a rational function of degree $d \sim O(\log|\nu| \log\frac{1}{\varepsilon} + \log^2|\nu| + |\nu|^{-1}\log^2\frac{1}{\varepsilon})$ as $|\nu| \to \infty$, $\varepsilon \to 0$, with slightly more complicated bounds for $\nu = 0$. If N is the number of points used in the discretization of a cylindrical (circular) boundary in two dimensions, then, assuming that $\varepsilon < 1/N$, $O(N \log N \log\frac{1}{\varepsilon})$ work is required at each time step. This is comparable to the work required for the Fourier transform on the boundary. In three dimensions, the cost is proportional to $N^2\log^2 N + N^2 \log N \log\frac{1}{\varepsilon}$ for a spherical boundary with $N^2$ points, the first term coming from the calculation of a spherical harmonic transform at each time step. In short, nonreflecting boundary conditions can be imposed to any desired accuracy, at a cost dominated by the interior grid work, which scales like $N^2$ in two dimensions and $N^3$ in three dimensions.

Key words. Bessel function, approximation, high-order convergence, wave equation, Maxwell's equations, nonreflecting boundary condition, radiation boundary condition, absorbing boundary condition

AMS subject classifications. 33C10, 41A20, 44A10, 44A35, 65D20

PII. S0036142998336916

1. Introduction. A longstanding practical issue in numerical wave propagation and scattering problems concerns the reduction of an unbounded domain to a bounded domain by the imposition of nonreflecting boundary conditions at an artificial boundary. We restrict our attention to "time-domain" calculations, for which it is well known that the exact nonreflecting conditions are global in both space and time. While the problem has been widely studied (see Givoli [1] for an overview), the boundary conditions used in practice typically introduce serious numerical artifacts. An exception is the method developed by Ting and Miksis [2], which relies on Kirchhoff's formula to solve the wave equation in an exterior domain, but which is computationally expensive. The two most common approaches are based on the construction of local differential boundary conditions [3, 4] or absorbing regions [5, 6], but neither provides a clear sequence of approximations which converge to the exact, nonlocal conditions. Recently, Sofronov [7] and, independently, Grote and Keller [8] have developed and implemented an integrodifferential approach for three-dimensional calculations using a spherical boundary and have demonstrated that high accuracy can be achieved at reasonable cost. In their schemes, the work is of the same order as the explicit finite difference or finite element calculation in the interior of the domain. For $N^2$ points on the spherical boundary, $O(N^3)$ work is required. Hagstrom and Hariharan [9] have shown that these conditions can be effectively implemented using only local operators, but at the cost of introducing a large number of auxiliary functions at the boundary. A somewhat more general, but closely related, integral formulation is introduced in [10, 11, 12]. The fundamental analytical tool in the latter papers is what we refer to as the nonreflecting boundary kernel, which is the inverse Laplace transform of the logarithmic derivative of a Hankel function.

In this paper, we prove that the logarithmic derivative of a Hankel function can be approximated as a ratio of polynomials of modest degree, so that its inverse Laplace transform can be expressed as a sum of exponentials. Our analytical approach combines an extension of the Mittag-Leffler theorem with the approximation techniques of the fast multipole method. In particular, Theorem 4.1 presents an exact representation of the logarithmic derivative as a sum of poles plus a continuous density on the branch cut. Theorem 4.6, which is preceded by several technical lemmas, presents a reduced, approximate representation.

Using this approach, the cost of computing the nonreflecting boundary condition is comparable to that of a fast Fourier or spherical harmonic transform. For two-dimensional problems, $O(N \log N \log\frac{1}{\varepsilon})$ work is required at each time step, where N is the number of points used in the discretization of a cylindrical (circular) boundary. In three dimensions, the cost is proportional to $N^2\log^2 N + N^2 \log N \log\frac{1}{\varepsilon}$ for a spherical boundary with $N^2$ points. The first term comes from the calculation of the spherical harmonic transform using the fast algorithm of [13, 14].

Other authors, including Nedelec [15] and Cruz and Sesma [16], have studied the logarithmic derivative of the Hankel function, based on a variety of techniques. In this paper we present a sum-of-poles representation for the logarithmic derivative of a Hankel function of real order $\nu$ bounded away from zero with accuracy $\varepsilon$ for argument $z$ satisfying $\mathrm{Im}(z) > 0$. The number of poles is bounded by $O(\log|\nu| \cdot \log\frac{1}{\varepsilon} + \log^2|\nu| + |\nu|^{-1}\log^2\frac{1}{\varepsilon})$. A similar representation for $\nu = 0$ is also derived which is valid for $\mathrm{Im}(z) > \eta > 0$, requiring $O(\log\frac{1}{\eta} \cdot \log\frac{1}{\varepsilon} + \log\frac{1}{\varepsilon} \cdot \log\log\frac{1}{\varepsilon} + \log\frac{1}{\eta} \cdot \log\log\frac{1}{\eta})$ poles.

In section 2, we introduce nonreflecting boundary kernels. In section 3 we collect background material in a form convenient for the subsequent development. Section 4 contains the analytical and approximate treatment of the logarithmic derivative, while a procedure for computing these representations is presented in section 5. The results of our numerical computations are contained in section 6, and we present our conclusions in section 7.

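2. Nonreflecting boundary kernels. Let us first consider the wave equation

\[
u_{tt} = c^2\nabla^2 u \tag{2.1}
\]

in a two-dimensional annular domain $\rho_0 < \rho < \rho_1$. The general solution can be expressed as

\[
u(\rho, \theta, t) = \sum_{n=-\infty}^{\infty} e^{in\theta}\,\mathcal{L}^{-1}\left[a_n(s)K_n(\rho s/c) + b_n(s)I_n(\rho s/c)\right](t), \tag{2.2}
\]

where $K_n$ and $I_n$ are modified Bessel functions (see, for example, [17, section 9.6]),

\[
K_n(z) = \frac{\pi}{2}\,i^{n+1}H_n^{(1)}(ze^{\pi i/2}), \qquad I_n(z) = i^{-n}J_n(ze^{\pi i/2}), \qquad -\pi < \arg z \le \frac{\pi}{2}, \tag{2.3}
\]

the coefficients $a_n$ and $b_n$ are arbitrary functions analytic in the right half-plane, $\mathcal{L}$ denotes the Laplace transform

\[
\mathcal{L}[f](s) = \int_0^{\infty} e^{-st}f(t)\,dt, \tag{2.4}
\]

and $\mathcal{L}^{-1}$ denotes the inverse Laplace transform

\[
\mathcal{L}^{-1}[g](t) = \frac{1}{2\pi i}\int_{-i\infty}^{i\infty} e^{st}g(s)\,ds. \tag{2.5}
\]

Likewise, for the wave equation in a three-dimensional domain $r_0 < r < r_1$, the general solution can be expressed as

\[
u(r, \theta, \phi, t) = \sum_{n=0}^{\infty}\sum_{m=-n}^{n} Y_{nm}(\theta, \phi)\,\mathcal{L}^{-1}\left[a_{nm}(s)\frac{K_{n+1/2}(rs/c)}{\sqrt{rs/c}} + b_{nm}(s)\frac{I_{n+1/2}(rs/c)}{\sqrt{rs/c}}\right](t). \tag{2.6}
\]

If we imagine that $\rho = \rho_1$ (or $r = r_1$) is to be used as a nonreflecting boundary, then we can assume there are no sources in the exterior region and the coefficients $b_n(s)$ (or $b_{nm}(s)$) are zero. Let us now denote by $u_n(\rho, t)$ the function satisfying

\[
\mathcal{L}[u_n](\rho, s) = a_n(s)K_n(\rho s/c), \tag{2.7}
\]

so that

\[
\mathcal{L}\left[\frac{\partial}{\partial\rho}u_n\right](\rho, s) = a_n(s)\,\frac{s}{c}\,K_n'(\rho s/c) \tag{2.8}
\]

and

\[
\frac{\partial}{\partial\rho}u_n(\rho, t) = u_n(\rho, t) * \mathcal{L}^{-1}\left[\frac{s}{c}\,\frac{K_n'(\rho s/c)}{K_n(\rho s/c)}\right](t), \tag{2.9}
\]

where $*$ denotes Laplace convolution

\[
(f * g)(t) = \int_0^t f(\tau)g(t - \tau)\,d\tau. \tag{2.10}
\]

The convolution kernel in (2.9) is a generalized function. Its singular part is easily removed, however, by subtracting the first two terms of the asymptotic expansion

\[
\frac{s\,K_n'(\rho s/c)}{c\,K_n(\rho s/c)} \sim -\frac{s}{c} - \frac{1}{2\rho} + O(s^{-1}), \qquad s \to \infty. \tag{2.11}
\]

From the assumption $u_n(\rho, t) = 0$ for $t < 0$ and standard properties of the Laplace transform we obtain the boundary condition

\[
\frac{\partial}{\partial\rho}u_n(\rho, t) + \frac{1}{c}\frac{\partial}{\partial t}u_n(\rho, t) + \frac{1}{2\rho}u_n(\rho, t) = \int_0^t \sigma_n(t - \tau)u_n(\rho, \tau)\,d\tau, \tag{2.12}
\]

where

\[
\sigma_n(t) = \mathcal{L}^{-1}\left[\frac{s}{c} + \frac{1}{2\rho} + \frac{s\,K_n'(\rho s/c)}{c\,K_n(\rho s/c)}\right](t), \tag{2.13}
\]

which we impose at $\rho = \rho_1$.

Remark. The solution to the wave equation in physical space is recovered on the nonreflecting boundary from $u_n$ by Fourier transformation:

\[
u(\rho_1, \theta, t) = \sum_{n=-N/2}^{N/2-1} u_n(\rho_1, t)\,e^{in\theta}, \tag{2.14}
\]

assuming N points are used in the discretization.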

The analogous boundary condition in three dimensions is expressed in terms of the functions $u_{nm}(r, t)$ satisfying

\[
\mathcal{L}[u_{nm}](r, s) = a_{nm}(s)\,\frac{K_{n+1/2}(rs/c)}{\sqrt{rs/c}}. \tag{2.15}
\]

After some algebraic manipulation, assuming $u_{nm}(r, t) = 0$ for $t < 0$, we have

\[
\frac{\partial}{\partial r}u_{nm}(r, t) + \frac{1}{c}\frac{\partial}{\partial t}u_{nm}(r, t) + \frac{1}{r}u_{nm}(r, t) = \int_0^t \omega_n(t - \tau)u_{nm}(r, \tau)\,d\tau, \tag{2.16}
\]

where

\[
\omega_n(t) = \mathcal{L}^{-1}\left[\frac{s}{c} + \frac{1}{2r} + \frac{s\,K_{n+1/2}'(rs/c)}{c\,K_{n+1/2}(rs/c)}\right](t), \tag{2.17}
\]

which we impose at $r = r_1$.

Note that the boundary conditions (2.12) and (2.16) are exact but nonlocal, since they rely on a Fourier (or spherical harmonic) transformation in space and are history dependent. The form of the history is simple, however, and expressed, for each separate mode, in terms of a convolution kernel which is the inverse Laplace transform of a function defined in terms of the logarithmic derivative of a modified Bessel function

\[
\frac{d}{dz}\log K_\nu(z) = \frac{K_\nu'(z)}{K_\nu(z)}. \tag{2.18}
\]

Remark. In three dimensions, the required logarithmic derivative of $K_{n+1/2}(z)$ is a ratio of polynomials, so that one can recast the boundary condition in terms of a differential operator of order n. The resulting expression would be equivalent to those derived by Sofronov [7] and Grote and Keller [8].

The remainder of this paper is devoted to the approximation of the logarithmic derivatives (2.18) as a ratio of polynomials of degree $O(\log\nu)$, from which the convolution kernels $\sigma_n$ and $\omega_n$ can be expressed as a sum of decaying exponentials. This representation allows for the recursive evaluation of the integral operators in (2.12) and (2.16), using only $O(\log n)$ work per time step (see [18]). We note that, by Parseval's equality, the $L_2$ error resulting from convolution with an approximate kernel is sharply bounded by the $L_\infty$ error in the approximation to the kernel's transform. Precisely, approximating the kernel $B(t)$ by the kernel $A(t)$ we find

\[
\|A * u - B * u\|_2 = \|\hat{A}\hat{u} - \hat{B}\hat{u}\|_2 \le \sup_{s \in i\mathbb{R}}\frac{|\hat{A}(s) - \hat{B}(s)|}{|\hat{B}(s)|}\,\|\hat{B}\hat{u}\|_2 = \sup_{s \in i\mathbb{R}}\frac{|\hat{A}(s) - \hat{B}(s)|}{|\hat{B}(s)|}\,\|B * u\|_2, \tag{2.19}
\]

where we assume that $\hat{A}$, $\hat{B}$, and $\hat{u}$ are all regular for $\mathrm{Re}(s) > 0$. For finite times we may let $s$ have a positive real part, $\eta$:

\[
\|A * u - B * u\|_{L_2(0,T)} \le e^{\eta T}\sup_{s \in \eta + i\mathbb{R}}\frac{|\hat{A}(s) - \hat{B}(s)|}{|\hat{B}(s)|}\,\|B * u\|_{L_2(0,T)}. \tag{2.20}
\]

We therefore concentrate our theoretical developments on $L_\infty$ approximations. For ease of computation, however, we compute our rational representations by least squares methods. These do generally lead to small relative errors in the maximum norm, as will be shown.
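
The recursive evaluation mentioned above rests on a simple property of exponentials: if a kernel is a sum of terms $\alpha_j e^{\beta_j t}$, each term's convolution integral at time $t + \Delta t$ equals its value at time $t$ multiplied by $e^{\beta_j \Delta t}$, plus an integral over the last step only. The following is a minimal sketch of that idea, not the scheme of [18]: the coefficients `alphas` and `betas` are assumed given, the samples of u are assumed equispaced, and the local integral is approximated with the trapezoidal rule as a choice made here for brevity.

```python
import cmath


def recursive_convolution(alphas, betas, u_samples, dt):
    """Samples of (sigma * u)(m*dt) for sigma(t) = sum_j alphas[j] * exp(betas[j] * t).

    Each mode keeps one state I_j(t) = integral_0^t exp(betas[j]*(t - tau)) u(tau) dtau,
    advanced by I_j(t + dt) = exp(betas[j]*dt) * I_j(t) + (integral over the last step).
    The last-step integral is approximated by the trapezoidal rule.
    """
    decay = [cmath.exp(b * dt) for b in betas]
    states = [0j] * len(betas)
    result = [0j]  # the convolution vanishes at t = 0
    for m in range(1, len(u_samples)):
        for j in range(len(betas)):
            local = 0.5 * dt * (decay[j] * u_samples[m - 1] + u_samples[m])
            states[j] = decay[j] * states[j] + local
        result.append(sum(a * s for a, s in zip(alphas, states)))
    return result
```

The work per time step is proportional to the number of exponentials and does not grow with the length of the history, which is what makes a kernel representation with few poles attractive.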

Since Hankel functions are more commonly used in the special function literature, we will write the logarithmic derivatives as

\[
\frac{d}{dz}\log K_\nu(z) = \frac{d}{dz}\log H_\nu^{(1)}(ze^{\pi i/2}) = i\,\frac{H_\nu^{(1)\prime}(ze^{\pi i/2})}{H_\nu^{(1)}(ze^{\pi i/2})}. \tag{2.21}
\]

We are, then, interested in approximating the logarithmic derivative of the Hankel function on and above the real axis.


3. Mathematical preliminaries. In this section we collect several well-known facts concerning the Bessel equation, the logarithmic derivative of the Hankel function, and pole expansions, in a form that will be useful in the subsequent analytical development.

3.1. Bessel's equation. Bessel's differential equation

\[
z^2\frac{d^2u}{dz^2} + z\frac{du}{dz} + (z^2 - \nu^2)u = 0, \tag{3.1}
\]

for $\nu \in \mathbb{R}$, has linearly independent solutions $H_\nu^{(1)}$ and $H_\nu^{(2)}$, known as Hankel's functions. These can be expressed by the formulae

\[
H_\nu^{(1)}(z) = \frac{J_{-\nu}(z) - e^{-\nu\pi i}J_\nu(z)}{i\sin(\nu\pi)}, \qquad
H_\nu^{(2)}(z) = \frac{J_{-\nu}(z) - e^{\nu\pi i}J_\nu(z)}{-i\sin(\nu\pi)}, \tag{3.2}
\]

where the Bessel function of the first kind is defined by

\[
J_\nu(z) = \left(\frac{z}{2}\right)^{\nu}\sum_{k=0}^{\infty}\frac{(-z^2/4)^k}{k!\,\Gamma(\nu + k + 1)}. \tag{3.3}
\]

The expressions in (3.2) are replaced by their limiting values for integer values of $\nu$. (See, for example, [17, section 9.1].) For general $\nu$, the functions $H_\nu^{(1)}$ and $H_\nu^{(2)}$ have a branch point at $z = 0$ and it is customary to place the corresponding branch cut on the negative real axis and impose the restriction $-\pi < \arg z \le \pi$. We shall find it more convenient, however, to place the branch cut on the negative imaginary axis, with the restriction

\[
-\frac{\pi}{2} < \arg z \le \frac{3\pi}{2}. \tag{3.4}
\]

Hankel's functions have especially simple asymptotic properties. In particular (see, for example, [19, section 7.4.1]),

\[
H_\nu^{(1)}(z) \sim \left(\frac{2}{\pi z}\right)^{1/2} e^{i(z - \nu\pi/2 - \pi/4)}\sum_{k=0}^{\infty}\frac{i^k A_k(\nu)}{z^k}, \tag{3.5}
\]

\[
H_\nu^{(1)\prime}(z) \sim \left(\frac{2}{\pi z}\right)^{1/2} e^{i(z - \nu\pi/2 - \pi/4)}\sum_{k=0}^{\infty}\frac{i^k A_k(\nu)}{z^k}\left(i - \frac{1}{2z} - \frac{k}{z}\right), \tag{3.6}
\]

as $z \to \infty$, with $-\pi + \delta \le \arg z \le 2\pi - \delta$, where

\[
A_k(\nu) = \frac{(4\nu^2 - 1^2)(4\nu^2 - 3^2)\cdots(4\nu^2 - (2k - 1)^2)}{k!\,8^k} \tag{3.7}
\]

and the branch of the square root is determined by

\[
z^{1/2} = e^{(\log|z| + i\arg z)/2}. \tag{3.8}
\]

Finally we note the symmetry

\[
H_\nu^{(1)}(z) = e^{-\nu\pi i}H_{-\nu}^{(1)}(z). \tag{3.9}
\]

We also make use of the modified Bessel functions $K_\nu(z)$ and $I_\nu(z)$. These are linearly independent solutions of the equation obtained from (3.1) by the transformation $z \to iz$. Their Wronskian satisfies

\[
K_\nu(z)I_\nu'(z) - K_\nu'(z)I_\nu(z) = z^{-1}. \tag{3.10}
\]

Moreover we have for positive $r$ [20]

\[
H_\nu^{(1)}(re^{-\pi i/2}) = \frac{2}{\pi i}\,e^{-\nu\pi i/2}\left(e^{\nu\pi i}K_\nu(r) + \pi i\,I_\nu(r)\right). \tag{3.11}
\]

Asymptotic expansions of $K_\nu(r)$ and $I_\nu(r)$ for $r$ small and large are also known [17, sections 9.6 and 9.7]. For real $r$ and $\nu \ge 0$ we have

\[
K_\nu(r) \sim \begin{cases} -\gamma - \log\dfrac{r}{2}, & \nu = 0, \\[1ex] \dfrac{\Gamma(\nu)}{2}\left(\dfrac{2}{r}\right)^{\nu}, & \nu > 0, \end{cases} \qquad r \to 0, \tag{3.12}
\]

\[
I_\nu(r) \sim \frac{1}{\Gamma(\nu + 1)}\left(\frac{r}{2}\right)^{\nu}, \qquad r \to 0, \tag{3.13}
\]

\[
K_\nu(r) \sim \sqrt{\frac{\pi}{2r}}\,e^{-r}, \qquad r \to \infty, \tag{3.14}
\]

\[
I_\nu(r) \sim \sqrt{\frac{1}{2\pi r}}\,e^{r}, \qquad r \to \infty. \tag{3.15}
\]

Here $\gamma = 0.5772\ldots$ is the Euler constant.

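Finally, we note the uniform expansions of Bessel functions for $\nu \to \infty$ given in [17]. For Hankel function and derivative we have

\[
H_\nu^{(1)}(\nu z) \sim 2e^{-\pi i/3}\left(\frac{4\zeta}{1 - z^2}\right)^{1/4}\frac{\mathrm{Ai}(e^{2\pi i/3}\nu^{2/3}\zeta)}{\nu^{1/3}}, \tag{3.16}
\]

\[
H_\nu^{(1)\prime}(\nu z) \sim \frac{4e^{-2\pi i/3}}{z}\left(\frac{1 - z^2}{4\zeta}\right)^{1/4}\frac{\mathrm{Ai}'(e^{2\pi i/3}\nu^{2/3}\zeta)}{\nu^{2/3}}, \tag{3.17}
\]

as $\nu \to \infty$, where we restrict $z$ to $|\arg(z)| < \pi/2$ and define

\[
\frac{2}{3}\zeta^{3/2} = \log\frac{1 + \sqrt{1 - z^2}}{z} - \sqrt{1 - z^2}. \tag{3.18}
\]

Here, $\mathrm{Ai}(t)$ denotes the Airy function [17, section 10.4]. Note that $\zeta = 0$ when $z = 1$. Large $\nu$ approximations of the modified Bessel functions for real arguments, $r$, are given by

\[
K_\nu(\nu r) \sim \sqrt{\frac{\pi}{2\nu}}\,\frac{e^{-\nu\phi(r)}}{(1 + r^2)^{1/4}}, \qquad I_\nu(\nu r) \sim \frac{1}{\sqrt{2\pi\nu}}\,\frac{e^{\nu\phi(r)}}{(1 + r^2)^{1/4}}, \tag{3.19}
\]

\[
\phi(r) = \log\frac{r}{1 + \sqrt{1 + r^2}} + \sqrt{1 + r^2}. \tag{3.20}
\]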

3.2.  Hankel  function  logarithmic  derivative.  We  denote  the  logarithmic 
derivative  of  bv  Gu. 


G„(2)=dlogff<.,(2)=^a, 


-  '  Hl%)' 

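The following lemma states a few fundamental facts about $G_\nu$ that we will use below.

Lemma 3.1. The function $G_\nu(z)$, for $\nu \in \mathbb{R}$, satisfies the formulae

(3.22)  $G_{-\nu}(z) = G_\nu(z),$

(3.23)  $G_\nu(\bar z e^{\pi i}) = \overline{G_\nu(z)}\,e^{\pi i}, \qquad -\frac{\pi}{2} \le \arg z \le \frac{\pi}{2},$

where $\bar z = |z|e^{-i\arg z}$ is the complex conjugate of $z$. Asymptotic approximations to $G_\nu$ are

(3.24)  $G_\nu(z) \sim \begin{cases} \bigl(\log(ze^{-\pi i/2}/2) + \gamma\bigr)^{-1} z^{-1} + O(z), & \nu = 0,\\ -|\nu| z^{-1} + O(z^{2|\nu|-1}), & 0 < |\nu| < 1,\\ -|\nu| z^{-1} + O(z\log z), & |\nu| = 1,\\ -|\nu| z^{-1} + O(z), & |\nu| > 1, \end{cases} \qquad z \to 0,$

where $\gamma$ is the Euler constant,

(3.25)  $G_\nu(z) \sim \frac{\sum_{k=0}^{\infty}\frac{i^k A_k(\nu)}{z^k}\left(i - \frac{1}{2z} - \frac{k}{z}\right)}{\sum_{k=0}^{\infty}\frac{i^k A_k(\nu)}{z^k}}, \qquad z \to \infty,$

where $A_k(\nu)$ is defined in (3.7), and

(3.26)  $G_\nu(\nu z) \sim \frac{2e^{-\pi i/3}}{\nu^{1/3}z}\left(\frac{4\zeta}{1-z^2}\right)^{-1/2}\frac{\mathrm{Ai}'(e^{2\pi i/3}\nu^{2/3}\zeta)}{\mathrm{Ai}(e^{2\pi i/3}\nu^{2/3}\zeta)}, \qquad \nu \to \infty,$

where $\zeta$ is defined in (3.18). Furthermore, the function $u_\nu$ defined by

(3.27)  $u_\nu(z) = zG_\nu(z)$

satisfies the recurrence

(3.28)  $u_\nu(z) = \frac{z^2}{\nu - 1 - u_{\nu-1}(z)} - \nu.$

Fig. 3.1. Curve $z(\zeta)$ defined by (3.18) near which the scaled zeros of $H_\nu^{(1)}$ lie (see Lemma 3.2). The branch cut of $H_\nu^{(1)}$ is chosen (3.4) on the negative imaginary axis.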

Proof. Equations (3.22) and (3.23) and asymptotic expansion (3.24) follow immediately from the definitions (3.2) through (3.4) of $J_\nu$ and $H_\nu^{(1)}$. The asymptotic expansion (3.25) follows from (3.5) and (3.6), while (3.26) is a consequence of (3.16) and (3.17). The recurrence (3.28) follows from standard Bessel recurrences [17, section 9.1.27]. □
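The recurrence (3.28) can be checked numerically in the half-integer case, where it starts from $u_{1/2}(z) = iz - 1/2$. The following is a minimal sketch, not the authors' code: the function names are illustrative, and the closed form for $\nu = 3/2$ is an assumption derived here from $H_{3/2}^{(1)}(z) \propto e^{iz}(z+i)z^{-3/2}$.

```python
def u_half_integer(k, z):
    """u_nu(z) for nu = k + 1/2 (k >= 0), built with the recurrence (3.28)."""
    u = 1j * z - 0.5          # u_{1/2}(z) = iz - 1/2
    nu = 0.5
    for _ in range(k):
        nu += 1.0
        u = z * z / (nu - 1.0 - u) - nu
    return u


def u_three_halves_closed(z):
    # Closed form obtained by differentiating log(e^{iz} (z + i) z^{-3/2}).
    return 1j * z + z / (z + 1j) - 1.5


z = 2.0 + 1.0j
difference = abs(u_half_integer(1, z) - u_three_halves_closed(z))
```

The same routine also reproduces the symmetry $u_\nu(-\bar z) = \overline{u_\nu(z)}$ implied by (3.23), which is a convenient sanity check for larger $k$.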

The zeros of $H_\nu^{(1)}(z)$ are well characterized [17, 20]; they lie in the lower half z-plane near the curve shown in Figure 3.1, obtained by transformation [21] of Bessel's equation. In terms of the asymptotic approximation (3.16), this curve corresponds to negative, real arguments of the Airy function.

Lemma 3.2. The zeros $h_{\nu,1}, h_{\nu,2}, \ldots$ of $H_\nu^{(1)}(z)$ in the sector $-\pi/2 < \arg z < 0$ are given by the asymptotic expansion

(3.29)  $h_{\nu,n} \sim \nu\,z(\zeta_n) + O(\nu^{-1}), \qquad \nu \to \infty, \qquad n = 1, \ldots, \lfloor |\nu|/2 + 1/4 \rfloor,$

uniformly in $n$, where $\zeta_n$ is defined by the equation

(3.30)  $\zeta_n = e^{-2\pi i/3}\,\nu^{-2/3}\,a_n,$

$z(\zeta)$ is obtained from inverting (3.18), and $a_n$ is the nth negative zero of the Airy function Ai. The zeros in the sector $\pi < \arg z < 3\pi/2$ are given by $-\overline{h_{\nu,n}}$. In particular,

(3.31)  $h_{\nu,1} \sim \nu + e^{-2\pi i/3}(\nu/2)^{1/3}(-a_1),$

where $-a_1 = 2.338\ldots$

All $m$ terms of the first summation vanish, due to the combination of (3.34) and the equality $\sum_{j=0}^{m-1}\omega^{jk} = m\,\delta_{k0}$. For the error term we obtain


<h-i 

I- 1  -  -> 


j=  1  "  " 

,2 


< 


l9j! 


J  =  1  1  -  Zj/z 


< 


_  (*  ~Q  1)lqj!  ^  q2  +  1  LR,/y  I?;  i 

—  l+u-2  -  („- iy-\z\Re[^  l-zj/r. 


1*1  (0  -  l)2 


1 


(3.39) 

and 

(3.40) 


< 


a2  +  1 

(a  -  l)2 


Elgjj 

2  —  5 


j  =  l 


a2  + 1 

(^i)1 


WON 


Y  ^uj3m 

m~  1 

Uz~u: 

„  z  - 
;=0 

—  |5m(2)|. 


Moreover, repeating the computations of (3.39), we find


(3'41)  l/(*)l  <  ^p|F(Z)|. 

Now  the  combination  of  (3.38)  through  (3.41)  and  the  triangle  inequality  gives 
(3.35).  □ 

Inequality (3.35) remains valid if we assume instead that $|z_j| \le b$ and $\mathrm{Re}(z) = ab > b$, for arbitrary $b \in \mathbb{R}$, $b > 0$; this fact leads to the next two results, whose proofs mimic that of Lemma 3.3 and are omitted.

Lemma 3.4. Suppose $n, p$ are positive integers, $q_1, \ldots, q_n$ are complex numbers, and $z_1, \ldots, z_n$ are complex numbers contained in disks $D_1, \ldots, D_p$ of radii $r_1, \ldots, r_p$, centered at $c_1, \ldots, c_p$, respectively. The function

(3.42)  $f(z) = \sum_{j=1}^{n}\frac{q_j}{z - z_j}$

can be approximated for $z$ satisfying $\mathrm{Re}(z - c_i) \ge a r_i > r_i$ for $i = 1, \ldots, p$ by the $m \cdot p$-pole expansion

(3.43)  $g_m(z) = \sum_{i=1}^{p}\sum_{j=0}^{m-1}\frac{\gamma_{ij}}{z - (c_i + r_i\,\omega^j)},$

where $\gamma_{ij}$ is defined by

(3.44)  $\gamma_{ij} = \frac{1}{m}\sum_{l=0}^{m-1}\omega^{-jl}\sum_{z_k \in D_i \setminus U_{i-1}} q_k\left(\frac{z_k - c_i}{r_i}\right)^{l}, \qquad i = 1, \ldots, p, \quad j = 0, \ldots, m-1,$

with $U_{i-1} = \bigcup_{j<i} D_j$. The error of the approximation is bounded by

(3.45)  $|f(z) - g_m(z)| \le \frac{2(a^2+1)\,|F(z)|}{(a^m - 1)(a-1)^2},$

where

(3.46)  $F(z) = \sum_{j=1}^{n}\frac{|q_j|}{z - z_j}.$
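The single-disk case ($p = 1$) of this construction is easy to try out. The sketch below is an illustration under stated assumptions, not the authors' implementation: it takes $\omega = e^{2\pi i/m}$, uses made-up pole data `qs`, `zs`, and the helper names are hypothetical. It replaces a few poles inside the unit disk by $m$ poles on the unit circle whose first $m$ moments agree, as in (3.44).

```python
import cmath


def compress_poles(qs, zs, m, c=0j, r=1.0):
    """Replace poles zs (all inside |z - c| <= r) by m poles at c + r * omega**j."""
    omega = cmath.exp(2j * cmath.pi / m)
    # Moments of the original poles in the scaled variable (z - c) / r.
    moments = [sum(q * ((zk - c) / r) ** l for q, zk in zip(qs, zs))
               for l in range(m)]
    # Inverse discrete Fourier transform of the moments, as in (3.44).
    gammas = [sum(omega ** (-j * l) * moments[l] for l in range(m)) / m
              for j in range(m)]
    poles = [c + r * omega ** j for j in range(m)]
    return gammas, poles


def pole_sum(coeffs, poles, z):
    return sum(a / (z - p) for a, p in zip(coeffs, poles))


# Illustrative data (assumed, not from the paper): three poles in the unit disk.
qs = [1 + 0j, -0.5 + 0.25j, 0.3j]
zs = [0.5 + 0.2j, -0.3 - 0.6j, 0.1 + 0.9j]
z = 3 + 0.5j                      # evaluation point with Re(z) = 3 > 1
gammas, poles = compress_poles(qs, zs, m=12)
err = abs(pole_sum(qs, zs, z) - pole_sum(gammas, poles, z))
```

Because the two pole sums share their first $m$ Laurent coefficients at infinity, the difference decays like $|z|^{-m}$, which is the mechanism behind the factor $a^m - 1$ in (3.45).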

Lemma 3.5. Suppose that the discrete poles of Lemma 3.4 are replaced with a density $q$ defined on a curve $C$ with $C \subset U_p = D_1 \cup \cdots \cup D_p$, specifically

(3.47)  $f(z) = \int_C \frac{q(\zeta)}{z - \zeta}\,d\zeta,$

which is finite for $z$ outside $U_p$, and that $g_m$ is defined by (3.43) with $\gamma_{ij}$ defined by

(3.48)  $\gamma_{ij} = \frac{1}{m}\sum_{l=0}^{m-1}\omega^{-jl}\int_{C \cap (D_i \setminus U_{i-1})} q(\zeta)\left(\frac{\zeta - c_i}{r_i}\right)^{l} d\zeta, \qquad i = 1, \ldots, p, \quad j = 0, \ldots, m-1,$

with $U_{i-1} = \bigcup_{j<i} D_j$. Then the bound (3.45) holds as before.

Lemma 3.3 enables us to approximate, with exponential convergence, a function defined as a sum of poles. The fundamental assumption is that the region of interest be "separated" from the pole locations. The notion of separation is effectively relaxed by covering the pole locations with disks of varying size in an adaptive manner. In Lemmas 3.4 and 3.5, we use this approach to derive our principal analytical result.

4. Rational approximation of the logarithmic derivative. The Hankel function's logarithmic derivative $G_\nu(z)$ defined in (3.21) approaches a constant as $z \to \infty$ and is regular for finite $z \in \mathbb{C}$, except at $z = 0$, which is a branch point, and at the zeros of $H_\nu^{(1)}(z)$, all of which are simple. We can therefore develop a representation for $G_\nu$ analogous to that of the Mittag-Leffler theorem; the only addition is due to the branch cut on the negative imaginary axis. It will be convenient to work with $u_\nu(z)$, defined in (3.27), for which approximations to be introduced have simple error bounds.

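Theorem 4.1. The function $u_\nu(z) = zG_\nu(z)$, where $G_\nu$ is defined for $\nu \in \mathbb{R}$ by (3.21) with the branch cut defined by (3.4), is given by the formula

(4.1)  $u_\nu(z) = iz - \frac{1}{2} + \sum_{n=1}^{N_\nu}\frac{h_{\nu,n}}{z - h_{\nu,n}} - \frac{1}{\pi i}\int_0^{\infty}\frac{\mathrm{Im}\bigl(u_\nu(re^{-\pi i/2})\bigr)}{ir + z}\,dr$

for $z \in \mathbb{C}$ not in $\{0, h_{\nu,1}, h_{\nu,2}, \ldots, h_{\nu,N_\nu}\}$ and not on the negative imaginary axis. Here $h_{\nu,1}, h_{\nu,2}, \ldots, h_{\nu,N_\nu}$ denote the zeros of $H_\nu^{(1)}(z)$, which number $N_\nu$.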

Proof. The case of the spherical Hankel function, where $\nu = k + 1/2$ for $k \in \mathbb{Z}$, is simple and we consider it first. Here $u_\nu(z)$ is a ratio of polynomials in $iz$ with real coefficients, which is clear from the observation that $u_{1/2}(z) = iz - 1/2$ in combination with the recurrence (3.28). Hence

(4.2)  $u_\nu(z) = p(z) + \sum_{n=1}^{N_\nu}\frac{\alpha_{\nu,n}}{z - h_{\nu,n}},$

where $p$ is a polynomial and $\alpha_{\nu,n}$ is the residue of $u_\nu$ at $h_{\nu,n}$,

(4.3)  $\alpha_{\nu,n} = \lim_{z \to h_{\nu,n}} (z - h_{\nu,n})\,u_\nu(z) = h_{\nu,n},$

by l'Hôpital's rule. We see from (3.25) that

(4.4)  $u_\nu(z) \sim iz - \frac{1}{2}, \qquad z \to \infty,$

whence

(4.5)  $p(z) = iz - \frac{1}{2}.$

Noting that $u_\nu(iy) \in \mathbb{R}$ for $y \in \mathbb{R}$, and combining (4.2), (4.3), and (4.5), we obtain (4.1).
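The spherical case of (4.1) can be verified directly for a small order. The sketch below is illustrative only: it assumes $\nu = 5/2$, for which the zeros of $H_{5/2}^{(1)}$ are the roots of $z^2 + 3iz - 3$ (taken here from the standard closed form of the spherical Hankel function $h_2^{(1)}$), and compares the pole representation with the value produced by the recurrence (3.28). Function names are hypothetical.

```python
import math


def u_by_recurrence(k, z):
    """u_nu(z), nu = k + 1/2, from u_{1/2}(z) = iz - 1/2 and recurrence (3.28)."""
    u = 1j * z - 0.5
    nu = 0.5
    for _ in range(k):
        nu += 1.0
        u = z * z / (nu - 1.0 - u) - nu
    return u


def u_by_poles(z, zeros):
    """Right-hand side of (4.1) when the branch-cut integral is absent."""
    return 1j * z - 0.5 + sum(h / (z - h) for h in zeros)


# Zeros of H_{5/2}^{(1)}: roots of z^2 + 3iz - 3 (assumption stated above).
zeros_5_2 = [(math.sqrt(3.0) - 3j) / 2, (-math.sqrt(3.0) - 3j) / 2]

sample_points = [0.7 + 0.2j, -1.5 + 2.0j, 3.0 + 0j]
gaps = [abs(u_by_recurrence(2, z) - u_by_poles(z, zeros_5_2))
        for z in sample_points]
```

Both zeros lie in the lower half-plane and are mirror images about the imaginary axis, in line with Lemma 3.2.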

We now consider the case $\nu \ne k + 1/2$, $k \in \mathbb{Z}$, for which the origin is a branch point. For $m = 1, 2, \ldots$, we define $C_m$ to be the simple closed curve, shown in Figure 4.1, which proceeds counterclockwise along the circle $C_m^{(1)}$ of radius $m + 1$ centered at the origin from $\arg z = -\pi/2$ to $3\pi/2$, to the vertical segment $z = re^{3\pi i/2}$, $r \in [1/m, m+1]$, to the circle $C_m^{(2)}$ of radius $1/m$ centered at the origin from $\arg z = 3\pi/2$ to $-\pi/2$, to the vertical segment $z = re^{-\pi i/2}$, back to the first circle. Since none of the zeros of $H_\nu^{(1)}$ lies on the imaginary axis, $C_m$ encloses them all if $m$ is sufficiently large. For such $m$, and $z \in \mathbb{C}$ inside $C_m$ with $H_\nu^{(1)}(z) \ne 0$, the residue theorem gives

(4.6)  $\frac{1}{2\pi i}\int_{C_m}\frac{u_\nu(\zeta)}{\zeta - z}\,d\zeta = u_\nu(z) + \sum_{n=1}^{N_\nu}\frac{h_{\nu,n}}{h_{\nu,n} - z}.$

We now consider the separate pieces of the contour $C_m$. For the circles $C_m^{(1)}$ and $C_m^{(2)}$ we use the asymptotic expansion (4.4) about infinity and (3.24) about the origin to obtain

(4.7)  $\lim_{m \to \infty}\frac{1}{2\pi i}\int_{C_m^{(1)}}\frac{u_\nu(\zeta)}{\zeta - z}\,d\zeta = iz - \frac{1}{2}, \qquad \lim_{m \to \infty}\frac{1}{2\pi i}\int_{C_m^{(2)}}\frac{u_\nu(\zeta)}{\zeta - z}\,d\zeta = 0.$

Fig. 4.2. Plot of $\mathrm{Re}\bigl(u_\nu(re^{-\pi i/2})\bigr)$, containing the zero crossing, and $\mathrm{Im}\bigl(u_\nu(re^{-\pi i/2})\bigr)$, for $\nu = 2$ and $r \in [0, 3]$.


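Now exploiting the symmetry $u_\nu(re^{3\pi i/2}) = \overline{u_\nu(re^{-\pi i/2})}$ from (3.23) for the vertical segments, we obtain

(4.8)  $\lim_{m \to \infty}\frac{1}{2\pi i}\int_{C_m}\frac{u_\nu(\zeta)}{\zeta - z}\,d\zeta = iz - \frac{1}{2} + \frac{1}{2\pi i}\int_0^{\infty}\frac{2i\,\mathrm{Im}\bigl(u_\nu(re^{-\pi i/2})\bigr)}{re^{-\pi i/2} - z}\,e^{-\pi i/2}\,dr,$

which, when combined with (4.6), yields (4.1) and the theorem. □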

The primary aim of this paper is to reduce the summation and integral of (4.1) to a similar summation involving dramatically fewer terms. To do so, we restrict $z$ to the upper half-plane and settle for an approximation. Such a representation is possible, for the poles of $u_\nu$ (zeros of $H_\nu^{(1)}$) lie entirely in the lower half-plane and do not cluster near the real axis. We first examine the behavior of $u_\nu$ on the negative imaginary axis.

The qualitative behavior of $u_\nu$ on the branch cut is illustrated by the case of $\nu = 2$, shown in Figure 4.2. The plot changes little with changing $\nu$, except for the sign of $\mathrm{Im}(u_\nu(z))$ and the sharpness of its extremum.

Lemma 4.2. For $\nu \in \mathbb{R}$, $\nu \ne k + 1/2$, $k \in \mathbb{Z}$, the function $u_\nu(re^{-\pi i/2})$ is infinitely differentiable on $r \in (0, \infty)$ and has imaginary part satisfying the following formulae:

(4.9)  $\mathrm{Im}\bigl(u_\nu(re^{-\pi i/2})\bigr) = \frac{\pi\cos(\nu\pi)}{\cos^2(\nu\pi)\,K_\nu^2(r) + \bigl(\pi I_\nu(r) + \sin(\nu\pi)\,K_\nu(r)\bigr)^2},$

(4.10)  $\mathrm{Im}\bigl(u_\nu(re^{-\pi i/2})\bigr) \sim \begin{cases} \frac{\pi}{(\log(r/2) + \gamma)^2 + \pi^2}, & \nu = 0,\\ \frac{\pi\cos(\nu\pi)}{4^{|\nu|-1}\,\Gamma(|\nu|)^2}\,r^{2|\nu|}, & \nu \ne 0, \end{cases} \qquad r \to 0,$

(4.11)  $\mathrm{Im}\bigl(u_\nu(re^{-\pi i/2})\bigr) \sim 2\cos(\nu\pi)\,r\,e^{-2r}, \qquad r \to \infty,$

(4.12)  $\mathrm{Im}\bigl(u_\nu(re^{-\pi i/2})\bigr) \sim \frac{\cos(\nu\pi)\sqrt{r^2 + \nu^2}}{\cosh\bigl(2\nu\,\phi(r/|\nu|)\bigr) + \sin(|\nu|\pi)}, \qquad |\nu| \to \infty,$

where $\phi$ is defined in (3.20).

Proof. Infinite differentiability of $u_\nu(z)$ follows from the observation that $H_\nu^{(1)}(z) \ne 0$ on the negative imaginary axis. To derive (4.9) we recall (3.11) to obtain

(4.13)  $\mathrm{Im}\bigl(u_\nu(re^{-\pi i/2})\bigr) = \frac{\pi r\cos(\nu\pi)\bigl(K_\nu(r)\,I_\nu'(r) - K_\nu'(r)\,I_\nu(r)\bigr)}{\cos^2(\nu\pi)\,K_\nu^2(r) + \bigl(\pi I_\nu(r) + \sin(\nu\pi)\,K_\nu(r)\bigr)^2},$

then apply (3.10). The remaining formulas follow from the asymptotic forms of $K_\nu(r)$ and $I_\nu(r)$ for small and large $r$, and the uniform large $\nu$ expansions given in (3.12) through (3.15) and (3.19). Here we use the symmetry $u_{-\nu} = u_\nu$. Note that (4.10) is valid for $r/|\nu| \to 0$. The approximation (4.12) is nonuniform for $\nu \approx 2k - 1/2$ and $\pi I_\nu(r) + \sin(\nu\pi)\,K_\nu(r) \approx 0$. □
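The large-order form (4.12) is sharply peaked where $\phi$ vanishes, since $\cosh$ is then at its minimum. A small sketch, under the assumption that $\phi$ is exactly the function in (3.20) and with illustrative function names, locates the unique positive zero $w^*$ of $\phi$ by bisection and evaluates the right-hand side of (4.12) near $r = |\nu| w^*$.

```python
import math


def phi(r):
    """phi(r) from (3.20)."""
    s = math.sqrt(1.0 + r * r)
    return math.log(r / (1.0 + s)) + s


def phi_zero(lo=0.1, hi=2.0, tol=1e-14):
    """Bisection for the positive zero of phi; phi is increasing on (0, inf)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if phi(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def im_u_large_nu(nu, r):
    """Right-hand side of (4.12); an asymptotic form, not the exact Im(u_nu)."""
    a = abs(nu)
    return (math.cos(nu * math.pi) * math.sqrt(r * r + nu * nu)
            / (math.cosh(2.0 * nu * phi(r / a)) + math.sin(a * math.pi)))


w_star = phi_zero()
nu = 20.0
peak = im_u_large_nu(nu, nu * w_star)
left = im_u_large_nu(nu, nu * (w_star - 0.1))
right = im_u_large_nu(nu, nu * (w_star + 0.1))
```

The value of $w^*$ is about 0.6627, so for large $|\nu|$ the bulk of the branch-cut density sits near $r \approx 0.66\,|\nu|$.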

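Lemma 4.3. Given $\nu_0 > 0$ there exist constants $c_0$ and $c_1$ such that for all $\nu \in \mathbb{R}$, $|\nu| \ge \nu_0$, $\nu \ne k + 1/2$, $k \in \mathbb{Z}$, and all $z$ satisfying $\mathrm{Im}(z) \ge 0$, the function

(4.14)  $f(z) = \int_0^{\infty}\frac{\mathrm{Im}\bigl(u_\nu(re^{-\pi i/2})\bigr)}{ir + z}\,dr$

satisfies the bounds

(4.15)  $\frac{c_0}{1 + |z|/|\nu|} \le |f(z)| \le \frac{c_1}{1 + |z|/|\nu|}.$

Moreover, there exists $\delta > 0$ such that for all $\nu \in \mathbb{R}$, $|\nu| \ge \nu_0$, and $\varepsilon$ with $0 < \varepsilon < 1/2$, $f(z)$ admits an approximation $g(z)$ that is a sum of $d \le \delta \cdot \bigl(1 + |\nu|^{-1}\log(1/\varepsilon)\bigr) \cdot \log(1/\varepsilon)$ poles, with

(4.16)  $|f(z) - g(z)| \le \varepsilon \cdot |f(z)|,$

provided $\mathrm{Im}(z) \ge 0$.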

Proof. We assume $\nu \ne k + 1/2$ for integral $k$ and begin by changing variables, $r = |\nu| w$, so that

(4.17)  $f(z) = \int_0^{\infty}\frac{\mathrm{Im}\bigl(u_\nu(|\nu| w\,e^{-\pi i/2})\bigr)}{iw + z/|\nu|}\,dw = \int_0^{\infty}\mu_z(w)\,dw.$

From the nonvanishing of $\mu_z$ and its asymptotic behavior in $w$, it is clear that (4.15) holds for $|\nu| \in (\nu_0, \nu_1)$ and any fixed $\nu_1 > \nu_0$. Using (4.12) for $|\nu|$ large but bounded away from $2k - 1/2$ for integral $k$, an application of Watson's lemma to (4.14) focuses on the unique positive zero, $w^*$, of $\phi$ defined in (3.20). As the derivative of this function is positive, we conclude

(4.18)  $f(z) \sim \frac{\alpha\cos(\nu\pi)}{iw^* + z/|\nu|},$

where $\alpha$ is a function of $w^*$, so that (4.15) clearly holds. However, as $\nu \to 2k - 1/2$, the denominator on the right-hand side of (4.12) may nearly vanish at $w^*$ and the expansion loses its uniformity. Setting $\cos(\nu\pi) = \eta$ in these cases, we see that the denominator has a minimum which is bounded below by $O(\eta^2)$. Hence in an $O(|\nu|^{-1})$ neighborhood of the minimum which includes $w^*$, we have

(4.19)  $\frac{\eta\,|\nu|\sqrt{1 + (w^*)^2}}{iw^* + z/|\nu|}\int_{-\gamma/|\nu|}^{\gamma/|\nu|}\frac{ds}{\eta^2 + \beta^2\nu^2 s^2},$

which by the change of variables $s = \eta t/|\nu|$ is seen to satisfy the upper bound in (4.15) uniformly in $\eta$. As the rest of the integral is small, the upper bound holds.

We now move on to the approximation. For a positive integer $m$ and a positive number $w_0$, we define intervals $I_0 = (0, w_0)$, $I_j = (2^{j-1}w_0, 2^j w_0)$ for $j = 1, \ldots, m$, and $I_{m+1} = (2^m w_0, \infty)$. Now

(4.20)  $f(z) = f_0(z) + f_1(z) + f_2(z),$

where $f_0$, $f_1$, and $f_2$ are defined by the formulae

(4.21)  $f_0(z) = \int_{I_0}\mu_z(w)\,dw, \qquad f_1(z) = \sum_{j=1}^{m}\int_{I_j}\mu_z(w)\,dw, \qquad f_2(z) = \int_{I_{m+1}}\mu_z(w)\,dw.$

We will now choose $w_0$ and $m$ so that $f_0$ and $f_2$ can be ignored and then use Lemma 3.5 to approximate $f_1$. Using (4.10) and (4.12) and taking $w_0$ sufficiently small we have, for some constant $c_2$ independent of $\nu$,

(4.22)  \f0(z)\  < 

Hence,  a  choice  of 


C2M 

1  +  N/M 


< 


C2 

i  +  W/M 


(4.23)  $w_0 = O\bigl(\varepsilon^{1/(2|\nu|)}\bigr), \qquad \varepsilon \to 0,$

suffices to guarantee

(4.24)  $|f_0(z)| \le \frac{\varepsilon}{3}\,|f(z)|$
in the closed upper half-plane. Now using (4.11) and (4.12) and assuming $m$ sufficiently large we have, for some constant $c_3$ independent of $\nu$,

(4.25)  $|f_2(z)| \le \frac{c_3|\nu|}{1 + |z|/|\nu|}\int_{2^m w_0}^{\infty} w\,e^{-|\nu| w}\,dw \le \frac{c_3}{1 + |z|/|\nu|}\,2 \cdot 2^m w_0\,e^{-|\nu| 2^m w_0}.$

From (4.23), choosing

(4.26)  $m \ge m_0 + m_1\,\frac{1}{|\nu|}\log\frac{1}{\varepsilon}$

for appropriate $m_0$ and $m_1$ independent of $\nu$ and $\varepsilon$ leads to

(4.27)  $|f_2(z)| \le \frac{\varepsilon}{3}\,|f(z)|.$

Finally, we apply Lemma 3.5 to the approximation of $f_1$. The error involves the function $F_1 = \int |\mathrm{Im}(u_\nu)|/(ir + z)\,dr$, but we note that $|F_1| = |f_1|$. Using $p$ poles for each $j$ we produce a $p \cdot m$-pole approximation $g(z)$ with an error estimate, again for $\mathrm{Im}(z) \ge 0$, given by

(4.28)  $|f_1(z) - g(z)| \le \frac{5}{3^p - 1}\,|f_1(z)|.$

A choice of

(4.29)  $p = O\bigl(\log(1/\varepsilon)\bigr)$

enforces

(4.30)  $|f_1(z) - g(z)| \le \frac{\varepsilon}{3}\,|f(z)|.$

By combining (4.24), (4.27), (4.30), and the triangle inequality, we obtain (4.16) with the number of poles, $d = p \cdot m$, satisfying the stated bound. □
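The dyadic construction in this proof can be imitated numerically. The sketch below makes several assumptions that are not in the paper: it uses a stand-in density $w e^{-w}$ in place of $\mathrm{Im}(u_\nu)$, discretizes each interval $I_j$ with a midpoint rule, and covers $I_j$ (mapped to the negative imaginary axis) with a disk whose center is three radii away from the real axis, which is the separation $a = 3$ behind the factor $5/(3^p - 1)$ in (4.28). All names are hypothetical.

```python
import cmath
import math


def density(w):
    # Hypothetical stand-in for the branch-cut density; NOT Im(u_nu).
    return w * math.exp(-w)


def dyadic_intervals(w0, m):
    # I_j = (2^(j-1) w0, 2^j w0), j = 1, ..., m
    return [(2.0 ** (j - 1) * w0, 2.0 ** j * w0) for j in range(1, m + 1)]


def compress_branch_cut(w0, m, p, nodes_per_interval=200):
    """Replace the discretized integral over each I_j by p poles on a circle."""
    omega = cmath.exp(2j * math.pi / p)
    coeffs, poles, samples = [], [], []
    for a, b in dyadic_intervals(w0, m):
        centre = -0.5j * (a + b)   # midpoint of I_j on the negative imaginary axis
        radius = 0.5 * (b - a)     # |centre| = 3 * radius for dyadic intervals
        h = (b - a) / nodes_per_interval
        pts = []
        for k in range(nodes_per_interval):
            w = a + (k + 0.5) * h
            pts.append((-1j * w, h * density(w)))   # pole at -i w, weight h * density
        samples.extend(pts)
        moments = [sum(q * ((zeta - centre) / radius) ** l for zeta, q in pts)
                   for l in range(p)]
        for j in range(p):
            coeffs.append(sum(omega ** (-j * l) * moments[l]
                              for l in range(p)) / p)
            poles.append(centre + radius * omega ** j)
    return coeffs, poles, samples


def pole_sum(pairs, z):
    # sum of q / (z - zeta); with zeta = -i w this is q / (i w + z)
    return sum(q / (z - zeta) for zeta, q in pairs)


coeffs, poles, samples = compress_branch_cut(w0=2.0 ** -6, m=12, p=12)
compressed = list(zip(poles, coeffs))
```

With 12 intervals and 12 poles per interval, 2400 quadrature poles are replaced by 144, and the relative difference stays small for any $z$ in the closed upper half-plane.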

The case $\nu = 0$ requires special treatment. First, the direct application of the preceding arguments leads to a significantly larger upper bound on the number of poles. Second, we note that $u_0(0) = 0$, so that relative error bounds near $z = 0$ require a vanishing absolute error. Finally, the lack of regularity of $u_0(z)$ at $z = 0$ precludes uniform rational approximation, as discussed in [10]. Therefore, we relax the condition $\mathrm{Im}(z) \ge 0$ to $\mathrm{Im}(z) \ge \eta > 0$. By (2.20) this will lead to good approximate convolutions for times $T \le \eta^{-1}$.

Lemma 4.4. There exists $\delta > 0$ such that for all $\varepsilon$, $0 < \varepsilon < 1/2$, and $\eta$, $0 < \eta < 1/2$, the function $f(z) = u_0(z) - iz + 1/2$ admits an approximation $g(z)$ that is a sum of $d \le \delta \cdot \bigl(\log(1/\eta) + \log\log(1/\varepsilon)\bigr) \cdot \log(1/\varepsilon)$ poles, with

(4.31)  $|f(z) - g(z)| \le \varepsilon \cdot |f(z)|,$

provided $\mathrm{Im}(z) \ge \eta$.

Proof. Note that since $u_0(z)$ has no poles, $f(z)$ is given by (4.14) and satisfies (4.15). Define intervals

$I_j = \bigl((2^{j-1} - 1)\eta,\ (2^j - 1)\eta\bigr)$ for $j = 1, \ldots, m$, $\qquad I_{m+1} = \bigl((2^m - 1)\eta,\ \infty\bigr).$

Now

(4.32)  $f(z) = f_1(z) + f_2(z),$

where $f_1$ and $f_2$ are defined by the formulae

(4.33)  $f_1(z) = \sum_{j=1}^{m}\int_{I_j}\frac{\mathrm{Im}\bigl(u_0(re^{-\pi i/2})\bigr)}{ir + z}\,dr, \qquad f_2(z) = \int_{I_{m+1}}\frac{\mathrm{Im}\bigl(u_0(re^{-\pi i/2})\bigr)}{ir + z}\,dr.$

We will now choose $m$ so that $f_2$ can be ignored and then use Lemma 3.5 to approximate $f_1$. Using (4.11) and assuming $m$ sufficiently large we have, for some constant $c$,

r  r°° 

(4.34)  1/2(2)!  <  — ■■■■  /  re~2rdw  <  -fy- 2 

1  +  \*\  /(2">-l),  1  +  M 

Hence, choosing

(4.35)  $m \ge m_0\bigl(\log(1/\eta) + \log\log(1/\varepsilon)\bigr)$

for appropriate $m_0$ independent of $\eta$ and $\varepsilon$ leads to

(4.36)  $|f_2(z)| \le \frac{\varepsilon}{2}\,|f(z)|.$

Finally, we apply Lemma 3.5 to the approximation of $f_1$. Using $p$ poles for each $j$ we produce a $p \cdot m$-pole approximation $g(z)$ with an error estimate for $\mathrm{Im}(z) \ge \eta$ given by

(4.37)  $|f_1(z) - g(z)| \le \frac{5}{3^p - 1}\,|f_1(z)|.$

A choice of

(4.38)  $p = O\bigl(\log(1/\varepsilon)\bigr)$

enforces

(4.39)  $|f_1(z) - g(z)| \le \frac{\varepsilon}{2}\,|f(z)|.$

By (4.36), (4.39), and the triangle inequality, (4.31) is achieved with the number of poles, $d = p \cdot m$, satisfying the stated bound. □

We  now  consider  the  contribution  of  the  poles. 

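Lemma 4.5. There exist constants $C_0, C_1, \delta > 0$ such that for all $\nu, \varepsilon \in \mathbb{R}$ with $2 \le |\nu|$ and $0 < \varepsilon < 1/2$ the function

(4.40)  $h(z) = \sum_{n=1}^{N_\nu}\frac{h_{\nu,n}}{z - h_{\nu,n}},$

where $h_{\nu,1}, \ldots, h_{\nu,N_\nu}$ are the roots of $H_\nu^{(1)}$, satisfies the inequalities

(4.41)  $\frac{C_0|\nu|}{1 + |z|/|\nu|} \le |h(z)| \le \frac{C_1|\nu|}{1 + |z|/|\nu|},$

and admits an approximation $g(z)$ that is a sum of $d \le \delta \cdot \log|\nu| \cdot \log(1/\varepsilon)$ poles, with

(4.42)  $|h(z) - g(z)| \le \varepsilon \cdot |h(z)|,$

provided $\mathrm{Im}(z) \ge 0$.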

Proof. The curve $C$ defined in Lemma 3.2, near which $h_{\nu,1}/|\nu|, \ldots, h_{\nu,N_\nu}/|\nu|$ lie, is contained in disks separated from the real axis. If we denote the disk of radius $r$ centered at $c$ by $D(r, c)$, then the disks

(4.43)  $\bigl\{D(-\mathrm{Im}(z), z) \mid z \in C,\ |\arg z - \pi/2| = \pi/2 + \pi/2^n,\ n = 1, 2, \ldots\bigr\},$

for example, contain $C \setminus \{+1, -1\}$. From (3.31), the root $h_{\nu,1}$ closest to the real axis satisfies

(4.44)  $\arg h_{\nu,1} \sim -2^{-4/3}\,3^{1/2}\,(-a_1)\,|\nu|^{-2/3};$

hence it is contained in a disk of (4.43) with $n \approx \log_2\bigl(2^{4/3}\,3^{-1/2}\,\pi\,(-a_1)^{-1}|\nu|^{2/3}\bigr)$, and all of the roots are contained in $O(\log|\nu|)$ of the disks. Now applying Lemma 3.4 we obtain (4.42) with $|h|$ replaced by $|H| = \bigl|\sum_n |h_{\nu,n}|/(z - h_{\nu,n})\bigr|$. To obtain the upper bound in (4.41) for both $h$ and $H$ we note first that it is trivial except for $|z/\nu| \approx 1$. A detailed analysis of the roots as described by Lemma 3.2 shows that

(4.45)  $|z - h_{\nu,j}| \ge C\,j^{2/3}\,|\nu|^{1/3}.$

Hence, for $|z/\nu| \approx 1$,

(4.46)  $|H(z)| \le C|\nu|^{2/3}\sum_{j=1}^{N_\nu} j^{-2/3} \le 3C|\nu|.$

The lower bound in (4.41) is again obvious except for $|z/\nu| \approx 1$. Then, however, we note that

(4.47)  $h(z) = u_\nu(z) - iz + 1/2 - f(z).$

Since, from (3.26), $|u_\nu(z)| = O(|\nu|^{2/3})$ for $|z/\nu| \approx 1$ and $|f(z)| = O(1)$ by (4.15), the right-hand side is dominated by $-iz$ and $|h(z)| = O(|\nu|)$. □

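The combination of Theorem 4.1 and Lemmas 4.3 and 4.5 suffices to prove our principal analytical result.

Theorem 4.6. Given $\nu_0 > 0$ there exists $\delta > 0$ such that for all $\nu \in \mathbb{R}$, $|\nu| \ge \nu_0$, and $0 < \varepsilon < 1/2$ there exists $d$ with

(4.48)  $d \le \delta\bigl(\log|\nu| \cdot \log(1/\varepsilon) + \log^2|\nu| + |\nu|^{-1}\log^2(1/\varepsilon)\bigr),$

and complex numbers $\alpha_1, \ldots, \alpha_d$ and $\beta_1, \ldots, \beta_d$, depending on $\nu$ and $\varepsilon$, such that the function

(4.49)  $U_{\nu,\varepsilon}(z) = iz - \frac{1}{2} + \sum_{n=1}^{d}\frac{\alpha_n}{z - \beta_n}$

approximates $u_\nu(z)$ with the bound

(4.50)  $|u_\nu(z) - U_{\nu,\varepsilon}(z)| \le \varepsilon \cdot |u_\nu(z)|,$

provided that $\mathrm{Im}(z) \ge 0$. Furthermore

(4.51)  $\left[\int_{-\infty}^{\infty}|u_\nu(x) - U_{\nu,\varepsilon}(x)|^2\,dx\right]^{1/2} \le \varepsilon\left[\int_{-\infty}^{\infty}|u_\nu(x) - ix + 1/2|^2\,dx\right]^{1/2}.$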

Proof.  We  first  note  the  lower  bound 


(4'52)  + 

For $\nu > 0$ the function is nonvanishing and has the correct asymptotic behavior, so we need only consider the case of $|\nu|$ large. The result then follows from (3.26). This proves (4.51) and (4.50) with $u_\nu$ replaced by $u_\nu - iz + 1/2$ on the right-hand side. From (3.26) we have

(4.53)  $|u_\nu(z) - iz + 1/2| \le c\,|\nu|^{1/3}\,|u_\nu(z)|,$

so that the final result follows from the scaling $\varepsilon \to |\nu|^{-1/3}\varepsilon$. □

The number of poles in (4.48) required to approximate $u_\nu(z)$ to a tolerance $\varepsilon$ depends on both $\varepsilon$ and $\nu$. The asymptotic dependence on $\varepsilon$ is proportional to $|\nu|^{-1}\log^2(1/\varepsilon)$. We will see in the numerical examples, however, that this term is important only for small $|\nu|$; otherwise the dominant term is the first, for an asymptotic dependence of $O(\log|\nu| \cdot \log(1/\varepsilon))$. As we generally have $\varepsilon \ll |\nu|^{-1}$ in practice, the term $\log^2|\nu|$ is of less importance.

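Similarly, Lemma 4.4 leads to the following theorem for $\nu = 0$.

Theorem 4.7. There exists $\delta > 0$ such that for all $\varepsilon$, $0 < \varepsilon < 1/2$, and $\eta$, $0 < \eta < 1/2$, there exists $d \le \delta \cdot \bigl(\log(1/\eta) \cdot \log(1/\varepsilon) + \log\log(1/\varepsilon) + \log\log(1/\eta)\bigr)$ and complex numbers $\alpha_1, \ldots, \alpha_d$ and $\beta_1, \ldots, \beta_d$, depending on $\eta$ and $\varepsilon$, such that the function

(4.54)  $U_{0,\varepsilon}(z) = iz - \frac{1}{2} + \sum_{n=1}^{d}\frac{\alpha_n}{z - \beta_n}$

approximates $u_0(z)$ with the bound

(4.55)  $|u_0(z) - U_{0,\varepsilon}(z)| \le \varepsilon \cdot |u_0(z)|,$

provided that $\mathrm{Im}(z) \ge \eta$. Furthermore

(4.56)  $\left[\int_{-\infty}^{\infty}|u_0(x + i\eta) - U_{0,\varepsilon}(x + i\eta)|^2\,dx\right]^{1/2} \le \varepsilon\left[\int_{-\infty}^{\infty}|u_0(x + i\eta) - ix + \eta + 1/2|^2\,dx\right]^{1/2}.$

Proof. Again we already have (4.55) with $u_0(z) - iz + 1/2$ on the right-hand side. By (3.24) we find

(4.57)  $|u_0(z) - iz + 1/2| \le c\,\log(1/\eta)\,|u_0(z)|.$

The theorem follows from the scaling $\varepsilon \to \log^{-1}(1/\eta)\,\varepsilon$. □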

As we must take $\eta = T^{-1}$, we see that the number of poles required may grow like $\log(1/\varepsilon) \cdot \log T + \log T \cdot \log\log T$. However, this is only for the mode $n = 0$ in the two-dimensional case. In short, the $T$ dependence is insignificant in practice.

5. Computation of the rational representations. Analytical error bound estimates developed in the previous sections are based on maximum norm errors as in (2.19) and (2.20). In numerical computation it is often convenient, however, to obtain least squares solutions. Our method of computing a rational function $U_{\nu,\varepsilon}$ that satisfies (4.50) is to enforce (4.51). An alternative approach would be to use rational Chebyshev approximation as developed by Trefethen and Gutknecht [24, 25, 26].

In the numerical computations, we work with

(5.1)  $\tilde u_\nu(z) = u_\nu(z) - iz + 1/2$

and its sum-of-poles approximation $\tilde U_{\nu,\varepsilon}(z) = U_{\nu,\varepsilon}(z) - iz + 1/2$. In particular, we have the nonlinear least squares problem

(5.2)  $\min_{P,Q}\int_{-\infty}^{\infty}\left|\frac{P(x)}{Q(x)} - \tilde u_\nu(x)\right|^2 dx$

for $P$, $Q$ polynomials with $\deg(P) + 1 = \deg(Q) = d$. Problem (5.2) is not only nonlinear, but also very poorly conditioned when $P$, $Q$ are represented in terms of their monomial coefficients. We apply two tactics for coping with these difficulties: linearization and orthogonalization.

We linearize the problem by starting with a good estimate of $Q$ and updating $P$, $Q$ iteratively. In particular, we solve the linear least squares problem

(5.3)  $\min_{P^{(i+1)},\,Q^{(i+1)}}\int_{-\infty}^{\infty}\left|\frac{P^{(i+1)}(x)}{Q^{(i)}(x)} - \frac{Q^{(i+1)}(x)}{Q^{(i)}(x)}\,\tilde u_\nu(x)\right|^2 dx,$

where the integral is replaced by a quadrature. The initial values $P^{(0)}$, $Q^{(0)}$ are obtained by exploiting the asymptotic expansion (3.25) and the recurrence (3.28). We find that two to three iterations of (5.3) generally suffice.

The quadrature for (5.3) is derived by first changing variables,

(5.4)  $\int_{-\infty}^{\infty} f(x)\,dx = \int_{-\pi/2}^{\pi/2} f(\tan\theta)\sec^2\theta\,d\theta \approx \sum_{i=1}^{m} w_i\,f(\tan\theta_i)\sec^2\theta_i,$

where $\theta_1, \ldots, \theta_m$ and $w_1, \ldots, w_m$ denote appropriate quadrature nodes and weights.

The transformed integrand is periodic on the interval $[-\pi/2, \pi/2]$, so the trapezoidal rule (or midpoint rule) is an obvious candidate. The integrand is infinitely continuously differentiable, except at $\theta = 0$, where its regularity is of order $2|\nu|$. For $|\nu| \ge 8$ (say), the trapezoidal rule delivers at least 16th-order convergence and is very effective. For small $|\nu|$, however, a quadrature that adjusts for the complicated singularity at $\theta = 0$ is needed. Here we can successively subdivide the interval near the singularity, applying high-order quadratures on each subinterval (see, for example, [27]).
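The change of variables $x = \tan\theta$ followed by an equispaced rule is simple to prototype. The following sketch uses the midpoint rule and a smooth test integrand chosen here for illustration, $1/(1 + x^4)$, whose integral over the real line is $\pi/\sqrt{2}$; it is not the integrand of (5.3), and the function name is an assumption.

```python
import math


def integrate_real_line(f, m):
    """Midpoint rule in theta for the integral of f over (-inf, inf), x = tan(theta)."""
    h = math.pi / m
    total = 0.0
    for i in range(m):
        theta = -0.5 * math.pi + (i + 0.5) * h
        c = math.cos(theta)
        total += f(math.tan(theta)) / (c * c)   # f(tan theta) * sec^2(theta)
    return h * total


approx = integrate_real_line(lambda x: 1.0 / (1.0 + x ** 4), 64)
exact = math.pi / math.sqrt(2.0)
```

For an integrand that is smooth and periodic in $\theta$ the midpoint rule converges very quickly; a limited-regularity point, such as the one at $\theta = 0$ described above, would reduce the order and calls for the subdivision strategy mentioned in the text.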

The quadrature discretization of (5.3) cannot be solved as a least squares problem by standard techniques, due to its extremely poor conditioning. We avoid forming the corresponding matrix; rather we solve the least squares problem by Gram-Schmidt orthogonalization. The $2d + 1$ functions

(5.5)  $\tilde u_\nu,\ 1,\ x\tilde u_\nu,\ x,\ \ldots,\ x^{d-1}\tilde u_\nu,\ x^{d-1},\ x^{d}\tilde u_\nu$

are orthogonalized under the real inner product

(5.6)  $\langle f, g\rangle = \int_{-\infty}^{\infty}\frac{\mathrm{Re}\bigl(\overline{f(x)}\,g(x)\bigr)}{|Q^{(i)}(x)|^2}\,dx$

to obtain the orthogonal functions

(5.7)  $g_n(x) = \begin{cases} \tilde u_\nu(x), & n = 1,\\ 1, & n = 2,\\ x\,g_{n-2}(x) - \sum_{j=1}^{\min\{4,\,n-1\}} c_{nj}\,g_{n-j}(x), & n = 3, \ldots, 2d+1, \end{cases}$

where

(5.8)  $c_{nj} = \frac{\langle x\,g_{n-2},\ g_{n-j}\rangle}{\langle g_{n-j},\ g_{n-j}\rangle}, \qquad n = 3, \ldots, 2d+1, \quad j = 1, \ldots, \min\{4, n-1\}.$

Now

(5.9)  $g_{2d+1} = -P^{(i+1)} + \tilde u_\nu\,Q^{(i+1)},$

so $P^{(i+1)}$ and $Q^{(i+1)}$ are computed from the recurrence coefficients $c_{nj}$ by splitting (5.7) into even- and odd-numbered parts.

Table 1
Number d of poles to represent the Laplace transform of nonreflecting boundary kernels σ_n and ω_n, for various values of ε.

ε = 10^-6
        σ_n                 ω_n
        n      d            n      d
        0     26
        1      9
        2      6
      3-6      5          0-5      n
      7-8      6          6-8      6
     9-12      7         9-12      7
    13-19      8        13-19      8
    20-31      9        20-31      9
    32-51     10        32-51     10
    52-86     11        52-86     11
   87-147     12       87-147     12
  148-227     13      148-228     13
  228-401     14      229-402     14
  402-728     15      403-728     15
 729-1024     16     729-1024     16

ε = 10^-8
        σ_n                 ω_n
        n      d            n      d
        0     44
        1     15
        2      9
      3-8      7          0-7      n
     9-10      8         8-10      8
    11-14      9        11-14      9
    15-20     10        15-19     10
    21-28     11        20-28     11
    29-41     12        29-40     12
    42-58     13        41-57     13
    59-84     14        58-83     14
   85-123     15       84-123     15
  124-183     16      124-183     16
  184-275     17      184-275     17
  276-418     18      276-418     18
  419-638     19      419-637     19
  639-971     20      638-971     20
 972-1024     21     972-1024     21

ε = 10^-15
        σ_n                 ω_n
        n      d            n      d
        1     41
        2     24
        3     18
        4     15
        5     14
        6     13
     7-12     12
    13-14     13         0-13      n
    15-16     14        14-15     14
    17-18     15        16-18     15
    19-22     16        19-21     16
    23-26     17        22-25     17
    27-31     18        26-30     18
    32-37     19        31-36     19
    38-45     20        37-44     20
    46-54     21        45-53     21
    55-65     22        54-65     22
    66-79     23        66-79     23
    80-97     24        80-96     24
   98-118     25       97-118     25
  119-145     26      119-144     26
  146-177     27      145-176     27
  178-216     28      177-216     28
  217-265     29      217-264     29
  266-324     30      265-324     30
  325-397     31      325-396     31
  398-486     32      397-485     32
  487-595     33      486-594     33
  596-728     34      595-727     34
  729-890     35      728-890     35
 891-1024     36     891-1024     36

For some applications, including nonreflecting boundary kernels, it is convenient to represent $P/Q$ as a sum of poles,

(5.10)  $\frac{P(z)}{Q(z)} = \sum_{n=1}^{d}\frac{\alpha_n}{z - \beta_n}.$

Table 2
Laplace transform of cylinder kernel σ_n defined in (2.13), approximated as a sum of d poles, for n = 1, ..., 4 and ε = 10^-6.

          Pole Coefficient                  Pole Location
 n        Re              Im               Re              Im
 1   -0.426478E-02    0.000000E+00    -0.368403E+01    0.000000E+00
     -0.416255E-01    0.000000E+00    -0.205860E+01    0.000000E+00
     -0.122665E+00    0.000000E+00    -0.118994E+01    0.000000E+00
     -0.143704E+00    0.000000E+00    -0.717570E+00    0.000000E+00
     -0.530662E-01    0.000000E+00    -0.423506E+00    0.000000E+00
     -0.863872E-02    0.000000E+00    -0.223111E+00    0.000000E+00
     -0.961472E-03    0.000000E+00    -0.103710E+00    0.000000E+00
     -0.721548E-04    0.000000E+00    -0.409342E-01    0.000000E+00
     -0.250102E-05    0.000000E+00    -0.117156E-01    0.000000E+00

 2    0.218164E-01    0.000000E+00    -0.333263E+01    0.000000E+00
      0.860648E+00    0.000000E+00    -0.162945E+01    0.000000E+00
     -0.138934E+01    0.162069E+00    -0.125843E+01    0.412637E+00
     -0.138934E+01   -0.162069E+00    -0.125843E+01   -0.412637E+00
      0.209905E-01    0.000000E+00    -0.612710E+00    0.000000E+00
      0.232032E-03    0.000000E+00    -0.240327E+00    0.000000E+00

 3   -0.179277E+00    0.000000E+00    -0.309775E+01    0.000000E+00
     -0.168335E+01    0.129111E+01    -0.167998E+01    0.130784E+01
     -0.168335E+01   -0.129111E+01    -0.167998E+01   -0.130784E+01
     -0.816322E+00    0.000000E+00    -0.187260E+01    0.000000E+00
     -0.126962E-01    0.000000E+00    -0.950854E+00    0.000000E+00

 4   -0.197725E+01    0.220886E+01    -0.197861E+01    0.220444E+01
     -0.197725E+01   -0.220886E+01    -0.197861E+01   -0.220444E+01
     -0.219247E+01    0.216535E+01    -0.282304E+01    0.382237E+00
     -0.219247E+01   -0.216535E+01    -0.282304E+01   -0.382237E+00
      0.464435E+00    0.000000E+00    -0.201159E+01    0.000000E+00

We compute $\beta_1, \ldots, \beta_d$ (zeros of $Q$) by Newton iteration with zero suppression (see, for example, [28]) by the formula

(5.11)  $\beta_n^{(j+1)} = \beta_n^{(j)} - \frac{Q(\beta_n^{(j)})}{Q'(\beta_n^{(j)}) - \sum_{k=1}^{n-1}\frac{Q(\beta_n^{(j)})}{\beta_n^{(j)} - \beta_k}},$

where $\beta_1, \ldots, \beta_{n-1}$ are the previously computed zeros of $Q$. Then $\alpha_1, \ldots, \alpha_d$ are computed by the formula $\alpha_n = P(\beta_n)/Q'(\beta_n)$. The derivative $Q'(z)$ is obtained by differentiating the recurrence (5.7).
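Newton iteration with zero suppression, as in (5.11), can be sketched for a polynomial given by its monomial coefficients. This is an illustration under assumptions, not the authors' procedure: the paper evaluates $Q$ and $Q'$ through the recurrence (5.7) rather than by Horner's rule, and the test polynomials $Q(z) = z^2 + 3iz - 3$ and $P(z) = 6 - 3iz$ (the $\nu = 5/2$ spherical case, for which each residue equals its pole) as well as the starting guess are choices made here.

```python
def horner(coeffs, z):
    """Return (Q(z), Q'(z)) for coefficients listed from highest degree down."""
    p, dp = 0j, 0j
    for c in coeffs:
        dp = dp * z + p
        p = p * z + c
    return p, dp


def zeros_with_suppression(coeffs, start=0.4 + 0.9j, tol=1e-13, max_iter=200):
    """Find all zeros of Q one at a time, suppressing those already found (5.11)."""
    found = []
    for _ in range(len(coeffs) - 1):
        z = start
        for _ in range(max_iter):
            q, dq = horner(coeffs, z)
            step = q / (dq - q * sum(1.0 / (z - b) for b in found))
            z -= step
            if abs(step) <= tol * (1.0 + abs(z)):
                break
        found.append(z)
    return found


Q = [1, 3j, -3]        # Q(z) = z^2 + 3iz - 3
P = [-3j, 6]           # P(z) = 6 - 3iz
betas = zeros_with_suppression(Q)
alphas = [horner(P, b)[0] / horner(Q, b)[1] for b in betas]   # alpha_n = P / Q'
```

Suppression divides out the zeros already located without explicitly deflating the polynomial, so the coefficients of $Q$ are never modified and rounding errors from earlier zeros do not accumulate in them.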


6. Numerical results. We have implemented the algorithm described in section 5 to compute the representations of $\sigma_n$ and $\omega_n$ through their Laplace transforms. Recall that for the cylinder kernels, $\sigma_n$, we have $\nu = n$ while for the sphere kernels, $\omega_n$, we have $\nu = n + 1/2$. Table 1 presents the sizes of the representations for $\varepsilon = 10^{-6}$, $10^{-8}$, and $10^{-15}$ in (4.51). For the cylinder kernels, which are affected by the branch cut, the number of poles for small $n$ is higher than for the sphere kernels. This discrepancy, however, rapidly vanishes as $n$ increases and the asymptotic performance ensues. The $\log(1/\varepsilon)$ dependence of the number of poles for $n \ge 10$ is clear.

For $\varepsilon = 10^{-8}$ we have also computed the maximum norm relative errors which appear in (2.19) by sampling on a fine mesh. For the cylinder kernel with $n = 0$, we expect an $O(1)$ error in a small interval about the origin due to (4.10). However, errors of less than $\varepsilon$ are achieved for $|s| \ge 5 \times 10^{-7}$. This implies a similar accuracy in the approximation of the convolution for times of order $10^6$. For all other cases the maximum norm relative errors are of order $\varepsilon$.

Finally, Table 2 presents poles and coefficients for the cylinder kernels for $n = 1, \ldots, 4$ and $\varepsilon = 10^{-6}$ to allow comparison by a reader interested in repeating our calculations. Note that the pole locations are written in terms of $s = z/i$. Extensive tables will be made available on the Web at http://math.nist.gov/mcsd/Staff/BAlpert.

Remark.  Our  approximate  representation  of  the  nonreflecting  boundary  kernel 
could  be  used  to  reduce  the  cost  of  the  method  introduced  by  Grote  and  Keller  [8].  The 
differential  operators  of  degree  n  obtained  in  their  derivation  need  only  be  replaced  by 
the corresponding differential operators of degree $\log n$ for any specified accuracy. It
is  interesting  to  note  that  in  the  two-dimensional  case,  where  the  approach  of  [8]  does 
not  apply,  the  analysis  described  above  can  be  used  to  derive  an  integrodifferential 
formulation  in  the  same  spirit. 

7.  Summary.  In  this  paper  we  have  introduced  new  representations  for  the 
logarithmic  derivative  of  a  Hankel  function  of  real  order,  that  scale  in  size  as  the 
logarithm  of  the  order.  An  algorithm  to  compute  the  representations  was  presented 
and  our  numerical  results  demonstrate  that  the  new  representations  are  modest  in 
size  for  orders  and  accuracies  likely  to  be  of  practical  interest. 

The present motivation for this work is the numerical modeling of nonreflecting
boundaries for the wave equation, discussed briefly here and in more detail in
[18].  Maxwell’s  equations  are  also  susceptible  to  similar  treatment  as  outlined  in  [29]. 
The  new  representations  enable  the  application  of  the  exact  nonreflecting  boundary 
conditions,  which  are  global  in  space  and  time,  to  be  computationally  effective. 

8. Appendix: Stability of exact and approximate conditions. In this appendix,
we consider the stability of our approach to the design of nonreflecting boundary
conditions. Given that we are approximating the exact conditions uniformly, it
is natural to expect that our approximations possess similar stability characteristics.
This is, indeed, the case. Oddly enough, however, the exact boundary conditions
themselves do not satisfy the uniform Kreiss-Lopatinski conditions which are necessary
and sufficient for strong well-posedness in the usual sense [30]. This may seem
paradoxical since the unbounded domain problem itself is strongly well-posed. The
difficulty is that the exact reduction of an unbounded domain problem to a bounded
domain problem gives rise to forcings (inhomogeneous boundary terms) which live in
a restricted subspace. The Kreiss-Lopatinski conditions, on the other hand, require
bounds for arbitrary forcings. In that setting, our best estimates result in the loss of
1/3 of a derivative in terms of Sobolev norms. In practice we doubt that this fact is
of any significance, and have certainly encountered no stability problems in our
long-time numerical simulations.

To fill in some of the details, consider a spherical domain $\Omega$ of radius one, within
which the homogeneous wave equation with homogeneous initial data is satisfied. At
the boundary we have

(8.1)  $\dfrac{\partial \hat u_{nm}}{\partial r} = (1 + \epsilon_n(s))\, \dfrac{s k_n'(s)}{k_n(s)}\, \hat u_{nm} + \hat g_{nm},$

where $\epsilon_n = 0$ for the exact condition and is uniformly small when we use our
approximations. Here $\hat g_{nm}$ is the spherical harmonic transform of an arbitrary forcing $g$.
Following Sakamoto, we seek to estimate


(8.2) 


H{u)  = 


■li, an  + 


du  .  ii 2 


o,ao 


dt , 
where

(8.3)  $\|f\|_{1,\Omega}^2 = \int_\Omega \left( f^2 + |\nabla f|^2 \right),$

while $\|\cdot\|_{0,\Omega}$ denotes the usual $L^2$ norm. On the boundary, $\partial\Omega$, we will make use
of fractional Sobolev norms, most easily defined in terms of the spherical harmonic
coefficients:

(8.4)  $\|f\|_{p,\partial\Omega}^2 = \sum_{n,m} (1 + n^2)^p\, |f_{nm}|^2.$

Strong well-posedness would follow from showing that

(8.5)  $H(u) \le c \int_0^T \|g(\cdot,t)\|_{0,\partial\Omega}^2 \, dt.$

Instead, we can show that

(8.6)  $H(u) \le c \int_0^T \|g(\cdot,t)\|_{1/3,\partial\Omega}^2 \, dt.$

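To prove this, let $s = iz$ and note that

(8.7)  $k_n(s) \propto h_n^{(1)}(z) \propto z^{-1/2} H_\nu^{(1)}(z), \quad \nu = n + \tfrac{1}{2}.$

Bounded solutions within the sphere are given by

(8.8)  $\hat u_{nm}(r,s) \propto j_n(rz) \propto (rz)^{-1/2} J_\nu(rz).$

Precisely, setting

(8.9)  $\hat u_{nm}(r,s) = A_{nm}(z)\,(rz)^{-1/2} J_\nu(rz),$

we find

(8.10)  $A_{nm}(z) = -\dfrac{\pi}{2i}\, z^{1/2} H_\nu^{(1)}(z)\, \delta_n(z)\, \hat g_{nm}(z),$

where

(8.11)  $\delta_n(z) = \left( 1 - \dfrac{\pi i}{2}\, \epsilon_n J_\nu(z) \left( z H_\nu^{(1)\prime}(z) - \tfrac{1}{2} H_\nu^{(1)}(z) \right) \right)^{-1}.$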

We now estimate norms of the solution. First note that the products in the definition
of $\delta_n^{-1}$, $J_\nu(z) H_\nu^{(1)}(z)$ and $z J_\nu(z) H_\nu^{(1)\prime}(z)$, are uniformly bounded for $\mathrm{Im}(z) > 0$. (See the
limits $z \to 0$, $z \to \infty$, and $\nu \to \infty$.) Therefore, as mentioned above, the error term,
so long as it is small, has no effect on the estimates we derive, and we simply ignore
it. That is, we set $\delta_n = 1$.

We concentrate on the boundary terms in $H$, as they are both the most
straightforward to compute and the most ill behaved. In transform space we have

(8.12)  $(1 + n^2)\,|\hat u_{nm}(1,s)|^2 + |s\, \hat u_{nm}(1,s)|^2 \le c\, \gamma_\nu^2(z)\, |\hat g_{nm}(z)|^2,$

where

(8.13)  $\gamma_\nu^2(z) = |H_\nu^{(1)}(z)|^2 \left( \nu^2 |J_\nu(z)|^2 + |z|^2 |J_\nu'(z)|^2 \right).$

(Here and throughout, $c$ will denote a positive constant independent of all variables.)
We first note that as the only singularities of Bessel functions occur at zero and
infinity, we need only consider the limits $z \to 0$, $z \to \infty$, and $\nu \to \infty$. The first two
are straightforward:

(8.14)  $\gamma_\nu^2(z) \approx c\, \Gamma^2(\nu)\, (z/2)^{-2\nu} \left( \nu^2 (z/2)^{2\nu} / \Gamma^2(\nu + 1) \right) = c, \quad z \to 0,$

(8.15)  $\gamma_\nu^2(z) \sim c\, |z|^{-1} \left( \nu^2 |z|^{-1} \cos^2 z + |z| \sin^2 z \right) \approx c \sin^2 z, \quad z \to \infty.$

For large $\nu$ we use the uniform asymptotic expansions of Bessel functions due to Olver
[20], which yield

(8.16)  $\sup \gamma_\nu^2(z) = O(\nu^{2/3}).$


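From Parseval's relation, we conclude that

(8.17)  $\int_0^T \left( \|u(\cdot,t)\|_{1,\partial\Omega}^2 + \left\| \dfrac{\partial u}{\partial t}(\cdot,t) \right\|_{0,\partial\Omega}^2 \right) dt \le c \int_0^T \|g(\cdot,t)\|_{1/3,\partial\Omega}^2 \, dt.$
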

The estimation of the spatial integrals is more involved, as for $r < 1$ the solution has
two transition zones, $z \approx \nu$ and $rz \approx \nu$, and there are a number of cases to consider.
However, the estimates follow along the same lines and lead to the same result.

It  is  interesting  to  note  that  the  loss-of-derivative  phenomenon  is  suppressed  when 
one  looks  at  the  error  due  to  the  approximation  of  the  boundary  condition.  In  that 
case  the  transform  of  the  exact  solution  near  the  boundary  is 


(8.18) 


tf’M. ,, . 


so that the error, $e$, satisfies the problem above with $\hat g_{nm}$ given by

(8.19)  $\hat g_{nm} = \epsilon_n\, \dfrac{z h_n^{(1)\prime}(z)}{h_n^{(1)}(z)}\, \hat u_{nm}(1,s) = \epsilon_n \mu_n(z)\, \hat u_{nm}(1,s).$



Now the best estimate of $\mu_n$ takes the form

(8.20)  $|\mu_n| \le c(|z| + \nu),$

which, in combination with (8.6), would lead to an estimate of the 1-norms of the
error in terms of the 4/3-norms of the solution. However, using again the large-$\nu$
asymptotics, a direct calculation shows

(8.21)  $|\mu_n \gamma_\nu| \le c(|z| + \nu).$

Thus $\mu_n$ is smaller than its maximum by $O(\nu^{-1/3})$ in the transition region where
$\gamma_\nu = O(\nu^{1/3})$. Hence we find for the error

(8.22)  $H(e) \le c \sup_n |\epsilon_n| \int_0^T \left( \|u(\cdot,t)\|_{1,\partial\Omega}^2 + \left\| \dfrac{\partial u}{\partial t}(\cdot,t) \right\|_{0,\partial\Omega}^2 \right) dt.$

In other words, the 1-norms of the error are controlled by the 1-norms of the solution.

We have, of course, ignored discretization error, which could conceivably cause
difficulties in association with the lack of strong well-posedness. To rule them out
would require a more detailed analysis. In practice we have encountered no difficulties,
even for very long time simulations. We should also note that strong well-posedness
could be artificially recovered by perturbing the approximate conditions for large $n$,
allowing high accuracy to be maintained for smooth solutions. Finally, we note that
a similar analysis can be carried out in two dimensions.

GENERALIZED GAUSSIAN QUADRATURES AND SINGULAR
VALUE DECOMPOSITIONS OF INTEGRAL OPERATORS*

N. YARVIN† AND V. ROKHLIN†

Abstract.  Generalized  Gaussian  quadratures  appear  to  have  been  introduced  by  Markov  late 
in  the  last  century  and  have  been  studied  in  great  detail  as  a  part  of  modern  analysis.  They  have 
not  been  widely  used  as  a  computational  tool,  in  part  due  to  an  absence  of  effective  numerical 
schemes  for  their  construction.  Recently,  a  numerical  scheme  for  the  design  of  such  quadratures  was 
introduced  by  Ma  et  al.;  numerical  results  presented  in  their  paper  indicate  that  such  quadratures 
dramatically  reduce  the  computational  cost  of  the  evaluation  of  integrals  under  certain  conditions. 
In  this  paper,  we  modify  their  approach,  improving  the  stability  of  the  scheme  and  extending  its 
range  of  applicability.  The  performance  of  the  method  is  illustrated  with  several  numerical  examples. 

Key words. quadratures, singular value decompositions, Chebyshev systems, fast algorithms

1. Introduction. Generalized Gaussian quadratures appear to have been introduced
by Markov [11, 12] late in the last century. More recent expositions include
those by Krein [9] and Karlin and Studden [8]. Those expositions contain proofs of
the existence of such quadratures for wide classes of functions; however, they do not
describe a numerical procedure for obtaining the quadrature weights and nodes.

Recently,  a  paper  by  Ma,  Rokhlin,  and  Wandzura  [10]  described  a  numerical 
algorithm  for  obtaining  such  quadratures.  In  [10],  a  version  of  Newton’s  method 
is  introduced  for  the  determination  of  nodes  and  weights  of  generalized  Gaussian 
quadratures. The procedure of [10] guarantees the convergence of the Newton algorithm
provided it is started sufficiently close to the solution (whose existence is proven
in  [11,  9,  8])  and  utilizes  a  continuation  procedure  to  provide  such  starting  points. 
The  present  paper  describes  a  variation  of  that  algorithm,  which  consists  mainly  of 
two  major  changes.  The  first  change  is  that  an  entirely  different  continuation  scheme 
is  used;  with  the  new  continuation  scheme,  the  algorithm  is  considerably  more  robust. 
The  second  change  is  the  addition  of  a  preprocessing  step  which,  given  as  input  a  large 
class  of  functions,  uses  the  singular  value  decomposition  (SVD)  to  produce  a  set  of 
basis  functions  suitable  for  the  algorithm. 

Since a substantial fraction of the algorithm is changed, this paper is written as a
repetition of [10], rather than as a list of changes; however, the portions dealing with
quadratures for functions with end-point singularities are omitted.

This paper is organized in the following manner. Section 2 summarizes the necessary
material from [9] and [8]. Section 3 briefly describes certain standard numerical
tools used by the algorithm. Section 4 contains various analytical results to be used in
the construction of the algorithm. Section 5 describes the algorithm in detail. Finally,
section 6 contains several numerical examples.

2. Mathematical preliminaries.

2.1.  Chebyshev  systems. 

DEFINITION 2.1. A sequence of functions $\phi_1, \ldots, \phi_n$ will be referred to as a
Chebyshev system on the interval $[a, b]$ if each of them is continuous and the determinant

(1)  $\begin{vmatrix} \phi_1(x_1) & \cdots & \phi_1(x_n) \\ \vdots & & \vdots \\ \phi_n(x_1) & \cdots & \phi_n(x_n) \end{vmatrix}$

is nonzero for any sequence of points $x_1, \ldots, x_n$ such that $a \le x_1 < x_2 < \cdots < x_n \le b$.

An alternate definition of a Chebyshev system is that any linear combination of
the functions whose coefficients are not all zero should have no more than $n - 1$ zeros
on $[a, b]$.

A  related  definition  is  that  of  an  extended  Chebyshev  system. 

DEFINITION 2.2. Given a set of functions $\phi_1, \ldots, \phi_n$ which are continuously
differentiable on an interval $[a, b]$, and given a sequence of points $x_1, \ldots, x_n$ such that
$a \le x_1 \le x_2 \le \cdots \le x_n \le b$, let the sequence $m_1, \ldots, m_n$ be defined by the formulae

(2)  $m_1 = 0$,
     $m_j = 0$ if $j > 1$ and $x_j \ne x_{j-1}$,
     $m_j = j - 1$ if $j > 1$ and $x_j = x_{j-1} = \cdots = x_1$,
     $m_j = k$ if $j > k + 1$ and $x_j = x_{j-1} = \cdots = x_{j-k} \ne x_{j-k-1}$.

Let the matrix $C(x_1, \ldots, x_n) = [c_{ij}]$ be defined by the formula

(3)  $c_{ij} = \dfrac{d^{m_j} \phi_i}{dx^{m_j}}(x_j),$

in which $\frac{d^0 \phi_i}{dx^0}(x_j)$ is taken to be the function value $\phi_i(x_j)$. Then $\phi_1, \ldots, \phi_n$ will be
referred to as an extended Chebyshev system on $[a, b]$ if the determinant $|C(x_1, \ldots, x_n)|$
is nonzero for all such sequences $x_i$.

Remark 2.1. It is obvious from Definition 2.2 that an extended Chebyshev system
is a special case of the Chebyshev system. The additional constraint is that the
successive points $x_i$ at which the function is sampled to form the matrix may be
identical; in that case, for each duplicated point, the first corresponding column contains
the function values, the second column contains the first derivatives of the functions,
the third column contains the second derivatives of the functions, and so forth; this
matrix must also be nonsingular.

Examples  of  Chebyshev  and  extended  Chebyshev  systems  include  the  following 
(additional  examples  can  be  found  in  [8]). 

EXAMPLE 2.1. The powers $1, x, x^2, \ldots, x^n$ form an extended Chebyshev system
on the interval $(-\infty, \infty)$.

EXAMPLE 2.2. The exponentials $e^{-\lambda_1 x}, e^{-\lambda_2 x}, \ldots, e^{-\lambda_n x}$ form an extended Chebyshev
system for any $\lambda_1, \ldots, \lambda_n > 0$ on the interval $[0, \infty)$.

EXAMPLE 2.3. The functions $1, \cos x, \sin x, \cos 2x, \sin 2x, \ldots, \cos nx, \sin nx$ form
a Chebyshev system on the interval $[0, 2\pi)$.

2.2. Generalized Gaussian quadratures. The quadrature rules considered
in this paper are expressions of the form

(4)  $\sum_{j=1}^{n} w_j\, \phi(x_j),$

where the points $x_j \in \mathbb{R}$ and coefficients $w_j \in \mathbb{R}$ are referred to as the nodes and
weights of the quadrature, respectively. They serve as approximations to integrals of
the form

(5)  $\int_a^b \phi(x)\, \omega(x)\, dx,$

where $\omega$ has the form

(6)  $\omega(x) = \tilde\omega(x) + \sum_{j=1}^{m} \mu_j\, \delta(x - \chi_j),$

with $m$ a nonnegative integer, $\tilde\omega : [a, b] \to \mathbb{R}$ an integrable nonnegative function,
$\chi_1, \chi_2, \ldots, \chi_m$ points on the interval $[a, b]$, $\mu_1, \mu_2, \ldots, \mu_m$ positive real coefficients,
and $\delta$ the Dirac $\delta$-function on $\mathbb{R}$.

Remark 2.2. Obviously, (6) defines $\omega$ to be a linear combination of a nonnegative
function with a finite collection of $\delta$-functions. In a mild abuse of notation, throughout
this paper we will be referring to $\omega$ as a nonnegative function.

Quadratures  are  typically  chosen  so  that  the  quadrature  (4)  is  equal  to  the  desired 
integral  (5)  for  some  set  of  functions,  commonly  polynomials  of  some  fixed  order. 
Of  these,  the  classical  Gaussian  quadrature  rules  consist  of  n  nodes  and  integrate 
polynomials  of  order  2n  —  1  exactly;  these  quadratures  are  used  in  this  paper  as  a 
numerical  tool  (see  section  3.2).  In  [10],  the  notion  of  a  Gaussian  quadrature  was 
generalized  as  follows. 

DEFINITION 2.3. A quadrature formula will be referred to as Gaussian with respect
to a set of $2n$ functions $\phi_1, \ldots, \phi_{2n} : [a, b] \to \mathbb{R}$ and a weight function $\omega : [a, b] \to \mathbb{R}^+$
if it consists of $n$ weights and nodes and integrates the functions $\phi_i$ exactly with
the weight function $\omega$ for all $i = 1, \ldots, 2n$. The weights and nodes of a Gaussian
quadrature will be referred to as Gaussian weights and nodes, respectively.

The  following  theorem  appears  to  be  due  to  Markov  [11,  12];  proofs  of  it  can  also 
be  found  in  [9]  and  [8]  (in  a  somewhat  different  form). 

THEOREM 2.1. Suppose that the functions $\phi_1, \ldots, \phi_{2n} : [a, b] \to \mathbb{R}$ form a Chebyshev
system on $[a, b]$. Suppose in addition that $\omega : [a, b] \to \mathbb{R}$ is defined by (6), and
that either

(7)  $\int_a^b \tilde\omega(x)\, dx > 0$

or $m \ge n$ (or both). Then there exists a unique Gaussian quadrature for $\phi_1, \ldots, \phi_{2n}$
on $[a, b]$ with respect to the weight function $\omega$. The weights of this quadrature are
positive.


2.3. Total positivity. A concept closely related to that of an extended Chebyshev
system is that of an extended totally positive (ETP) kernel.

DEFINITION 2.4. Given a function $K : [a, b] \times [c, d] \to \mathbb{R}$ which is $n$ times
continuously differentiable, and given a sequence of points $x_1, \ldots, x_n$ such that
$c \le x_1 \le x_2 \le \cdots \le x_n \le d$, let the sequence $m_1, \ldots, m_n$ be defined by (2). Let the
functions $\phi_1, \ldots, \phi_n$ be defined by the formula

(8)  $\phi_i(t) = \dfrac{\partial^{m_i} K}{\partial x^{m_i}}(x_i, t),$

in which $\frac{\partial^0 K}{\partial x^0}(x_j, t)$ is taken to be the function value $K(x_j, t)$. Then $K$ will be referred
to as ETP if the functions $\phi_1, \ldots, \phi_n$ form an extended Chebyshev system on $[c, d]$ for
all such sequences of $x_i$.

Examples  of  ETP  kernels  include  the  following  (additional  examples  can  be  found 
in  [8]). 

Example 2.4. The function $e^{-xt}$ is ETP for $x, t \in [0, \infty)$.

Example 2.5. The function $e^{-(x-t)^2}$ is ETP for $x, t \in (-\infty, \infty)$.

Example 2.6. The function $1/(x + t)$ is ETP for $x, t \in (0, \infty)$.

A  proof  of  the  following  lemma  can  be  found  in  [8],  for  example. 

LEMMA 2.2. Suppose that $K$ and $L$ are ETP functions of two variables. Then
the function $M$ defined by the formula

(9)  $M(x, t) = \int_c^d K(x, s)\, L(s, t)\, ds$

is ETP. In other words, if the kernels of two integral operators are ETP, the kernel
of the product of the two operators is ETP.

The  following  theorem  can  be  found  in  [7,  8]. 

Theorem 2.3. Suppose that $K : [a, b] \times [a, b] \to \mathbb{R}$ is an ETP kernel. Then the
first $p$ eigenfunctions of the integral operator $T : L^2[a, b] \to L^2[a, b]$ defined by the
formula

(10)  $(T\phi)(x) = \int_a^b K(x, s)\, \phi(s)\, ds$

constitute an extended Chebyshev system for any $p \ge 1$.

3.  Numerical  preliminaries. 

3.1. Newton's method. In this section we discuss two well-known numerical
techniques: Newton's method and the continuation method. A more detailed discussion
of these techniques can be found, for example, in [14].

Newton's method is an iterative method for the solution of nonlinear systems of
equations of the form $F(x) = 0$, where $F : \mathbb{R}^n \to \mathbb{R}^n$ is a continuously differentiable
function of the form

(11)  $F(x) = \begin{pmatrix} f_1(x) \\ f_2(x) \\ \vdots \\ f_n(x) \end{pmatrix}$

and $x = (x_1, \ldots, x_n)^T$. The method uses the Jacobian matrix $J$ of $F$, which is defined
by the formula

(12)  $J(x) = \begin{pmatrix} \frac{\partial f_1}{\partial x_1}(x) & \cdots & \frac{\partial f_1}{\partial x_n}(x) \\ \vdots & & \vdots \\ \frac{\partial f_n}{\partial x_1}(x) & \cdots & \frac{\partial f_n}{\partial x_n}(x) \end{pmatrix}.$


Lemma 3.1 (Newton's method). Suppose that for some $y \in \mathbb{R}^n$,

(13)  $F(y) = 0,$

with $F : \mathbb{R}^n \to \mathbb{R}^n$ defined by (11), and that $|J(y)| \ne 0$, with $|J(y)|$ denoting the
determinant of the matrix $J(x)$ defined in (12), evaluated at the point $y$. Given a
starting point $y_0 \in \mathbb{R}^n$, let the sequence $y_1, y_2, \ldots$ be defined by the formula

(14)  $y_{k+1} = y_k - J^{-1}(y_k)\, F(y_k).$

Then there exists a positive real number $\varepsilon$ such that for any $y_0$ satisfying the inequality
$\|y_0 - y\| < \varepsilon$, the sequence (14) converges to $y$ quadratically; that is, there exists a
positive real number $\alpha$ such that

(15)  $\|y_{k+1} - y\| \le \alpha \|y_k - y\|^2.$
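
The iteration (14) can be illustrated on a small system. The sketch below is an assumption-laden example rather than part of the algorithm of this paper: the function name `newton_2d`, the test system, and the starting point are hypothetical, and the linear solve is done by Cramer's rule only because the system has two unknowns.

```python
import math


def newton_2d(f, jac, y0, tol=1e-13, max_iter=50):
    """Iterate y_{k+1} = y_k - J(y_k)^{-1} F(y_k) for two equations in two unknowns."""
    y1, y2 = y0
    for _ in range(max_iter):
        f1, f2 = f(y1, y2)
        (a, b), (c, d) = jac(y1, y2)
        det = a * d - b * c
        if det == 0.0:
            raise ZeroDivisionError("singular Jacobian")
        # Solve J * step = F by Cramer's rule.
        s1 = (f1 * d - b * f2) / det
        s2 = (a * f2 - f1 * c) / det
        y1, y2 = y1 - s1, y2 - s2
        if max(abs(s1), abs(s2)) < tol:
            return y1, y2
    raise RuntimeError("Newton iteration did not converge")


# Hypothetical test system: a circle of radius 2 intersected with the line y1 = y2.
def f(y1, y2):
    return (y1 * y1 + y2 * y2 - 4.0, y1 - y2)


def jac(y1, y2):
    return ((2.0 * y1, 2.0 * y2), (1.0, -1.0))


root = newton_2d(f, jac, (1.0, 2.0))
```

With this starting point the iterates land on the line $y_1 = y_2$ after one step and then approach $(\sqrt{2}, \sqrt{2})$, where the Jacobian is nonsingular, so the hypotheses of Lemma 3.1 hold.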

3.1.1.  Continuation  method.  In  order  for  Newton’s  method  to  converge,  the 
starting  point  which  is  provided  to  it  must  be  close  to  the  desired  solution.  One 
scheme  for  generating  such  starting  points  is  the  continuation  method,  which  is  as 
follows. 

Suppose that in addition to the function $F : \mathbb{R}^n \to \mathbb{R}^n$ whose zero is to be found,
another function $G : [0, 1] \times \mathbb{R}^n \to \mathbb{R}^n$ is available which possesses the following
properties.

(i) For any $x \in \mathbb{R}^n$,

(16)  $G(1, x) = F(x).$

(ii) The solution of the equation $G(0, x) = 0$ is known.

(iii) For all $t \in [0, 1]$, the equation $G(t, x) = 0$ has a unique solution $x$ such that
the conditions of Lemma 3.1 are satisfied.

(iv) The solution $x$ is a continuous function of $t$.

If these conditions are met, an algorithm for the solution of $F(x) = 0$ is as follows.
Let the points $t_i$, for $i = 1, \ldots, m$, be defined by the formula $t_i = i/m$. Solve in
succession the equations

$G(t_1, x) = 0,$
$G(t_2, x) = 0,$
$\vdots$
$G(t_m, x) = 0,$

using Newton's method, with the starting point for Newton's method for each equation
taken to be the solution of the preceding equation. The solution $x$ of the final equation
$G(t_m, x) = 0$ is, by (16), identical to the solution of the desired equation $F(x) = 0$.
Obviously, for sufficiently large $m$, Newton's method is guaranteed by Lemma 3.1 to
converge at each step.
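
A minimal sketch of the continuation scheme above, for a scalar equation, is given below. The homotopy $G(t, x) = F(x) - (1 - t) F(x_0)$ is one simple choice satisfying (16) with a known solution $x_0$ at $t = 0$; it is an assumption made for illustration and is not the continuation scheme used by the algorithm of this paper. The names `newton_scalar`, `continuation`, `f`, and `df` are likewise hypothetical.

```python
def newton_scalar(g, dg, x, tol=1e-12, max_iter=100):
    """Scalar Newton iteration x <- x - g(x) / g'(x)."""
    for _ in range(max_iter):
        step = g(x) / dg(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton iteration did not converge")


def continuation(f, df, x0, m=5):
    """Solve f(x) = 0 by stepping t = 1/m, 2/m, ..., 1 in G(t, x) = f(x) - (1 - t) f(x0)."""
    f0 = f(x0)
    x = x0  # known solution of G(0, x) = 0
    for i in range(1, m + 1):
        t = i / m

        def g(z, t=t):
            return f(z) - (1.0 - t) * f0

        # dG/dx equals f'(x) for this homotopy; start from the previous solution.
        x = newton_scalar(g, df, x)
    return x


# Hypothetical test problem: x^3 + x - 10 = 0, whose real root is x = 2.
def f(x):
    return x ** 3 + x - 10.0


def df(x):
    return 3.0 * x * x + 1.0


root = continuation(f, df, x0=0.0)
```

Since $f$ is strictly increasing here, every intermediate equation has a unique solution that varies continuously with $t$, which is what conditions (iii) and (iv) require.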

Remark 3.1. In practice, it is desirable to choose the smallest $m$ for which the
above algorithm will work, in order to reduce the computational cost of the scheme.
On the other hand, the largest step $(t_i - t_{i-1})$ for which the Newton method will
converge commonly varies as a function of $t$. Thus the algorithm described in this
paper uses an adaptive version of the scheme.

3.2. Gaussian integration and interpolation. Classical Gaussian quadrature
rules are a well-known numerical tool (see, for instance, [14]); they integrate
polynomials of order $2n - 1$ exactly with respect to some weight function and consist
of $n$ weights and nodes. A variety of Gaussian quadratures were analyzed in the last
century, each being defined by a distinct weight function. Of these, the algorithm
presented in this paper uses only the Gaussian quadratures for the weight function
$\omega(x) = 1$ on the region of integration $[-1, 1]$. These quadratures are closely associated
with the Legendre polynomials; we will refer to their nodes as Legendre nodes.

Another numerical tool used in this paper is polynomial interpolation on Legendre
nodes. Interpolation refers to the following problem: given two finite real sequences
$f_1, \ldots, f_n \in \mathbb{R}$ and $x_1, \ldots, x_n \in [a, b]$, construct a function $f : [a, b] \to \mathbb{R}$ such that
$f(x_i) = f_i$ for all $i = 1, \ldots, n$. One interpolation scheme is polynomial interpolation,
in which the interpolating function $f$ is a polynomial of degree $n - 1$. As is well known,
such a polynomial always exists and is unique. However, in general two numerical
difficulties arise with polynomial interpolation using polynomials of high order. The
first is that for many sequences of points $\{x_i\}$, the values of the interpolating
polynomial between the points $\{x_i\}$ are not well conditioned as a function of the values
$\{f_i\}$ to be interpolated. The second is that even for those sequences of points where
the computation of the values of the interpolating polynomial is well conditioned, the
computation of the coefficients of the power series of the interpolating polynomial is
extremely ill conditioned.

As is well known, these difficulties do not arise if the points $\{x_i\}$ are taken to
be Chebyshev nodes and the interpolating polynomial is computed as a series of
Chebyshev polynomials rather than as a power series. As the following lemma shows,
the difficulties also do not arise if the points $\{x_i\}$ are taken to be Legendre nodes and
the interpolating polynomial is computed as a series of Legendre polynomials. The
lemma makes use of the following properties of the Legendre polynomials: first, that
the $i$th Legendre polynomial $P_i$ has degree $i$; second, that the polynomials $P_i$ form
an orthonormal system of functions on $[-1, 1]$.

Lemma 3.2. Suppose that $x_1, \ldots, x_n \in [-1, 1]$ are the Legendre nodes of order
$n$, and that $w_1, \ldots, w_n \in \mathbb{R}$ are the associated Gaussian weights. Given a sequence
$f_1, \ldots, f_n \in \mathbb{R}$, let $p : [-1, 1] \to \mathbb{R}$ be the interpolating polynomial of degree $n - 1$
such that $p(x_i) = f_i$ for all $i = 1, \ldots, n$, and let $c_0, \ldots, c_{n-1}$ be the coefficients of the
Legendre series of $p$; that is,

(17)  $p(x) = \sum_{i=0}^{n-1} c_i P_i(x),$

where $P_i(x)$ is the $i$th Legendre polynomial. Then the following relation holds:

(18)  $\sum_{i=1}^{n} w_i f_i^2 = \int_{-1}^{1} p(x)^2\, dx = \sum_{i=0}^{n-1} c_i^2.$

Proof. The second equality of (18) follows from (17) and the orthonormality of the
Legendre polynomials. The first equality may be proven as follows: the polynomial $p$
has degree $n - 1$; thus its square has degree $2n - 2$. Since the Gaussian quadrature
integrates exactly all polynomials up to order $2n - 1$, it integrates $p^2$ exactly; thus
the first equality of (18) holds. □
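
The relation (18) can be checked by hand for $n = 2$. The sketch below does so numerically, using the two-point Gauss-Legendre rule (nodes $\pm 1/\sqrt{3}$, weights 1) and the first two Legendre polynomials normalized to unit $L^2$ norm on $[-1, 1]$. The data values `f` are arbitrary, and all variable names are assumptions made for illustration.

```python
import math

# Two-point Gauss-Legendre rule on [-1, 1].
nodes = [-1.0 / math.sqrt(3.0), 1.0 / math.sqrt(3.0)]
weights = [1.0, 1.0]


# Legendre polynomials of degree 0 and 1, scaled to be orthonormal on [-1, 1].
def p0(x):
    return 1.0 / math.sqrt(2.0)


def p1(x):
    return math.sqrt(1.5) * x


basis = [p0, p1]

# Arbitrary values to interpolate at the two nodes.
f = [3.0, -1.0]

# c_j = integral of p * P_j; the quadrature evaluates it exactly because
# the integrand has degree at most 2n - 2.
coeffs = [
    sum(w * fi * p(x) for x, w, fi in zip(nodes, weights, f))
    for p in basis
]


def interpolant(x):
    """Legendre series (17) of the interpolating polynomial."""
    return sum(c * p(x) for c, p in zip(coeffs, basis))


lhs = sum(w * fi ** 2 for w, fi in zip(weights, f))  # weighted sum of f_i^2
rhs = sum(c ** 2 for c in coeffs)                    # sum of c_i^2
```

For these data both sides of (18) equal 10, and the interpolant reproduces the values 3 and -1 at the two nodes; the map from the values $f_i$ (scaled by $\sqrt{w_i}$) to the coefficients $c_i$ is therefore orthogonal, which is the source of its good conditioning.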

3.3. Singular value decomposition. The singular value decomposition (SVD)
is a ubiquitous tool in numerical analysis, which is given for the case of real matrices
by the following lemma (see, for instance, [3] for more details).

Lemma 3.3. For any $n \times m$ real matrix $A$, there exists an $n \times p$ real matrix $U$
with orthonormal columns, an $m \times p$ real matrix $V$ with orthonormal columns, and a
$p \times p$ real diagonal matrix $S = [s_{ij}]$ whose diagonal entries are nonnegative, such that
$A = U S V^*$ and that $s_{ii} \ge s_{i+1,i+1}$ for all $i = 1, \ldots, p - 1$.

The diagonal entries $s_{ii}$ of $S$ are called singular values; the columns of the matrix
$V$ are called right singular vectors; the columns of the matrix $U$ are called left singular
vectors.


3.4. Singular value decompositions of integral operators. This section,
which follows [5], contains an existence theorem for a factorization of integral operators.
The operators $T : L^2[c, d] \to L^2[a, b]$ to which it applies are of the form

(19)  $(Tf)(x) = \int_c^d K(x, t)\, f(t)\, dt,$

in which the function $K : [a, b] \times [c, d] \to \mathbb{R}$ is referred to as the kernel of the operator
$T$. Throughout this section, it will be assumed that all functions are square integrable;
the term “norm” will mean the $L^2$ norm.

The  following  theorem,  which  defines  the  factorization,  is  proven  in  a  more  general 
form  as  Theorem  VI.  17  in  [13]. 

THEOREM 3.4. Suppose that the function $K : [a, b] \times [c, d] \to \mathbb{R}$ is square
integrable. Then there exist two orthonormal sequences of functions $u_i : [a, b] \to \mathbb{R}$ and
$v_i : [c, d] \to \mathbb{R}$ and a sequence $s_i \in \mathbb{R}$, for $i = 1, \ldots, \infty$, such that

(20)  $K(x, t) = \sum_{i=1}^{\infty} u_i(x)\, s_i\, v_i(t)$

and that $s_1 \ge s_2 \ge \cdots \ge 0$. The sequence $s_i$ is uniquely determined by $K$. Furthermore,
the functions $v_i$ are eigenfunctions of the operator $T^*T$, where $T$ is defined by
(19), and the values $s_i$ are the square roots of the eigenvalues of $T^*T$.

By analogy to the finite-dimensional case, we will refer to this factorization as
the singular value decomposition. We will refer to the functions $u_i$ as left singular
functions of $K$ (or of $T$), to $v_i$ as right singular functions, and to $s_i$ as singular values.

As is the case for the discrete SVD, this decomposition can be used to construct
an approximation to the function $K$ by discarding small singular values and the
associated singular functions:

(21)  $K(x, t) \approx \sum_{i=1}^{p} u_i(x)\, s_i\, v_i(t).$

The error of this approximation can then be computed from (20):

(22)  $K(x, t) - \sum_{i=1}^{p} u_i(x)\, s_i\, v_i(t) = \sum_{i=p+1}^{\infty} u_i(x)\, s_i\, v_i(t),$

and, therefore,

(23)  $\left\| K - \sum_{i=1}^{p} u_i s_i v_i \right\| = \sqrt{\sum_{i=p+1}^{\infty} s_i^2}.$


Using (21), integrals of the form

(24)  $\int_a^b K(x, t)\, \omega(x)\, dx$

can be approximated by the formula

(25)  $\int_a^b K(x, t)\, \omega(x)\, dx \approx \int_a^b \sum_{i=1}^{p} u_i(x)\, s_i\, v_i(t)\, \omega(x)\, dx = \sum_{i=1}^{p} s_i\, v_i(t) \int_a^b u_i(x)\, \omega(x)\, dx.$

Thus a quadrature which is exact for each of the integrals

(26)  $\int_a^b u_i(x)\, \omega(x)\, dx,$

for $i = 1, \ldots, p$, is an approximate quadrature for integrals of the form (24).

Many  properties  of  the  singular  functions  of  an  integral  operator  can  be  deduced 
from  the  corresponding  properties  of  eigenfunctions  of  integral  operators;  a  property 
of  concern  in  this  paper  is  that  of  forming  an  extended  Chebyshev  system  and  is 
addressed  by  the  following  theorem. 

THEOREM 3.5. Suppose that $K : [a, b] \times [c, d] \to \mathbb{R}$ is ETP. Then the first $p$ left
singular functions of $K$ form an extended Chebyshev system for any $p$; likewise the
first $p$ right singular functions of $K$ form an extended Chebyshev system for any $p$.

Proof. Let the integral operator $T : L^2[c, d] \to L^2[a, b]$ be defined by the formula

(27)  $(Tf)(x) = \int_c^d K(x, t)\, f(t)\, dt,$

and let the function $L : [a, b] \times [a, b] \to \mathbb{R}$ be defined by the formula

(28)  $L(x, t) = \int_c^d K(x, s)\, K(t, s)\, ds.$

Clearly, the integral operator $S : L^2[a, b] \to L^2[a, b]$ defined by the formula $S = TT^*$
has the kernel $L$:

(29)  $(S\phi)(x) = \int_a^b \int_c^d K(x, s)\, K(t, s)\, ds\, \phi(t)\, dt = \int_a^b L(x, t)\, \phi(t)\, dt.$

Since $K$ is ETP, $L$ is also ETP, due to Lemma 2.2. Thus by Theorem 2.3, the
eigenfunctions of $S$ constitute an extended Chebyshev system. By Theorem 3.4, these
eigenfunctions are identical to the left singular functions of $T$, which proves that the
first $p$ left singular functions of $T$ constitute an extended Chebyshev system for any
$p$. The proof for the right singular functions is identical. □

4.  Analytical  apparatus. 

4.1.  Convergence  of  Newton’s  method.  In  this  section,  we  observe  that  the 
nodes  and  the  weights  of  a  Gaussian  quadrature  satisfy  a  certain  system  of  nonlinear 
equations.  We  then  prove  that  the  Newton  method  for  this  system  of  equations  is 
always  quadratically  convergent,  provided  the  functions  to  be  integrated  constitute 
an  extended  Chebyshev  system. 
Given a set of functions $\phi_1, \ldots, \phi_{2n}$ and a weight function $\omega$, the Gaussian
quadrature is defined by the system of equations

(30)
$\sum_{j=1}^{n} w_j\,\phi_1(x_j) = \int_a^b \phi_1(x)\,\omega(x)\,dx,$
$\sum_{j=1}^{n} w_j\,\phi_2(x_j) = \int_a^b \phi_2(x)\,\omega(x)\,dx,$
$\vdots$
$\sum_{j=1}^{n} w_j\,\phi_{2n}(x_j) = \int_a^b \phi_{2n}(x)\,\omega(x)\,dx$

(see Definition 2.3). Let the left-hand sides of these equations be denoted by $f_1$
through $f_{2n}$. Then each $f_i$ is a function of the weights $w_1, \ldots, w_n$ and nodes $x_1, \ldots, x_n$
of the quadrature. Its partial derivatives are given by the obvious formulae


(31)  $\frac{\partial f_k}{\partial w_i} = \phi_k(x_i),$

(32)  $\frac{\partial f_k}{\partial x_i} = w_i\,\phi_k'(x_i).$


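Thus the Jacobian matrix of the system (30) is

(33)  $J(x_1,\ldots,x_n,w_1,\ldots,w_n) = \begin{pmatrix} \phi_1(x_1) & \cdots & \phi_1(x_n) & w_1\phi_1'(x_1) & \cdots & w_n\phi_1'(x_n) \\ \vdots & & \vdots & \vdots & & \vdots \\ \phi_{2n}(x_1) & \cdots & \phi_{2n}(x_n) & w_1\phi_{2n}'(x_1) & \cdots & w_n\phi_{2n}'(x_n) \end{pmatrix}.$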


Lemma 4.1. Suppose that the functions $\phi_1, \ldots, \phi_{2n}$ form an extended Chebyshev
system. Let the weights and nodes of the Gaussian quadrature for these functions be
denoted by $w_i$ and $x_i$. Then the determinant of J is nonzero at the point which
constitutes the Gaussian quadrature; in other words, $|J(x_1,\ldots,x_n,w_1,\ldots,w_n)| \neq 0$.

Proof. It is immediately obvious from (33) that

(34)  $|J(x_1,\ldots,x_n,w_1,\ldots,w_n)| = w_1 \cdot w_2 \cdots w_n \cdot \begin{vmatrix} \phi_1(x_1) & \cdots & \phi_1(x_n) & \phi_1'(x_1) & \cdots & \phi_1'(x_n) \\ \vdots & & \vdots & \vdots & & \vdots \\ \phi_{2n}(x_1) & \cdots & \phi_{2n}(x_n) & \phi_{2n}'(x_1) & \cdots & \phi_{2n}'(x_n) \end{vmatrix}.$

If $\phi_1, \ldots, \phi_{2n}$ form an extended Chebyshev system, then by Theorem 2.1 the weights
$w_1, \ldots, w_n$ of the Gaussian quadrature are positive. In addition, by the definition
of an extended Chebyshev system, the determinant on the right-hand side of (34) is
nonzero. Thus

(35)  $|J(x_1,\ldots,x_n,w_1,\ldots,w_n)| \neq 0.$  □

Using  the  inverse  function  theorem,  we  immediately  obtain  the  following  corollary. 

COROLLARY  4.2.  Under  the  conditions  of  Lemma  4.1,  the  Gaussian  weights  and 
nodes  depend  continuously  on  the  weight  function. 
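
As a small sanity check of the Jacobian structure used in Lemma 4.1, consider the smallest case n = 1. The pair $\phi_1(x) = 1$, $\phi_2(x) = x$ is chosen here purely for illustration and does not come from the text. The left-hand sides of the system are then $f_1 = w_1$ and $f_2 = w_1 x_1$, and the factorization of the determinant can be written out in full:

```latex
\[
\begin{aligned}
J(x_1, w_1) &=
\begin{pmatrix}
\phi_1(x_1) & w_1\,\phi_1'(x_1) \\
\phi_2(x_1) & w_1\,\phi_2'(x_1)
\end{pmatrix}
=
\begin{pmatrix}
1 & 0 \\
x_1 & w_1
\end{pmatrix}, \\[4pt]
|J(x_1, w_1)| &= w_1 \cdot
\begin{vmatrix}
1 & 0 \\
x_1 & 1
\end{vmatrix}
= w_1 .
\end{aligned}
\]
```

In this toy case the first equation of the system gives $w_1 = \int_a^b \omega(x)\,dx$, which is positive for a positive weight function, so the Jacobian is nonsingular at the quadrature, in agreement with the lemma.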
4.2. Interpolation. This section contains two basic lemmas about interpolation.
The following lemma shows that any interpolation scheme on an interval [a, b]
whose output depends linearly on its input is characterized by a finite sequence of
functions $[a,b] \to \mathbb{R}$.

LEMMA 4.3. Suppose $L : \mathbb{R}^n \to L^2[a,b]$ is an interpolation scheme with n nodes
$x_1, \ldots, x_n \in [a,b]$, and that L is a linear mapping. Then there exists a sequence of
functions $\alpha_1, \ldots, \alpha_n : [a,b] \to \mathbb{R}$ such that for any vector $f \in \mathbb{R}^n$, with elements
$f = (f_1, \ldots, f_n)^T$,

(36)  $(Lf)(x) = \sum_{i=1}^{n} f_i\,\alpha_i(x)$

for all $x \in [a,b]$.

Proof. Let the vectors $e_1, \ldots, e_n \in \mathbb{R}^n$ with elements $e_i = (e_{i1}, \ldots, e_{in})^T$ be
the standard basis in $\mathbb{R}^n$; that is, $e_{ii} = 1$ for all $i = 1, \ldots, n$ and $e_{ij} = 0$ for all
$i, j = 1, \ldots, n$ such that $i \neq j$. Let the functions $\alpha_1, \ldots, \alpha_n : [a,b] \to \mathbb{R}$ be defined
by the formula $\alpha_i = L e_i$. Since the interpolation scheme L is linear, for any vector
$f \in \mathbb{R}^n$ with elements $f = (f_1, \ldots, f_n)^T$, and for any point $x \in [a,b]$,

(37)  $(Lf)(x) = \Big( L \Big( \sum_{i=1}^{n} f_i\,e_i \Big) \Big)(x) = \sum_{i=1}^{n} f_i\,(L e_i)(x) = \sum_{i=1}^{n} f_i\,\alpha_i(x).$  □

In the case of polynomial interpolation, the functions $\alpha_i$ are referred to as Lagrange
polynomials; by analogy to that case, we will in general refer to the functions $\alpha_i$ as
the Lagrange functions of the interpolation scheme.

The  following  lemma  provides  an  error  bound  for  approximation  of  a  function  of 
two  variables  using  two  one-dimensional  interpolation  formulae,  expressed  in  terms 
of  error  bounds  for  each  one-dimensional  interpolation  scheme  applied  separately.  Its 
proof  is  an  exercise  in  elementary  analysis  and  is  omitted. 

Lemma 4.4. Suppose that $x_1, x_2, \ldots, x_n \in [a,b]$ and $t_1, t_2, \ldots, t_m \in [c,d]$ are two
finite real sequences, and that $\alpha_1, \alpha_2, \ldots, \alpha_n : [a,b] \to \mathbb{R}$ and $\beta_1, \beta_2, \ldots, \beta_m : [c,d] \to \mathbb{R}$
are two sequences of bounded functions. Suppose further that $L_1 : \mathbb{R}^n \to L^\infty[a,b]$ is
an interpolation formula with the nodes $x_1, \ldots, x_n$ and Lagrange functions $\alpha_1, \ldots, \alpha_n$,
and $L_2 : \mathbb{R}^m \to L^\infty[c,d]$ is an interpolation formula with the nodes $t_1, \ldots, t_m$ and
Lagrange functions $\beta_1, \ldots, \beta_m$. Suppose that $\eta \in \mathbb{R}$ is such that

(38)  $\sum_{i=1}^{n} |\alpha_i(x)| < \eta$

for all $x \in [a,b]$, or

(39)  $\sum_{j=1}^{m} |\beta_j(t)| < \eta$

for all $t \in [c,d]$. Finally, suppose that K is a function $[a,b] \times [c,d] \to \mathbb{R}$, and that for
all $x \in [a,b]$ and $t \in [c,d]$,

(40)  $\Big| K(x,t) - \sum_{i=1}^{n} K(x_i,t)\,\alpha_i(x) \Big| < \varepsilon$

and

(41)  $\Big| K(x,t) - \sum_{j=1}^{m} K(x,t_j)\,\beta_j(t) \Big| < \varepsilon.$

Then

(42)  $\Big| K(x,t) - \sum_{i=1}^{n} \sum_{j=1}^{m} K(x_i,t_j)\,\alpha_i(x)\,\beta_j(t) \Big| < \varepsilon\,(1 + \eta)$

for all $x \in [a,b]$ and $t \in [c,d]$.

4.3.  Approximation  of  SVD  of  an  integral  operator.  This  section  describes 
a  numerical  procedure  for  computing  an  approximation  to  the  SVD  of  an  integral 
operator. 

The  algorithm  uses  quadratures  which  possess  the  following  property. 
DEFINITION  4.1.  We  will  say  that  the  combination  of  a  quadrature  and  an  in¬ 
terpolation  scheme  preserves  inner  products  on  an  interval  [a,  b]  if  it  possesses  the 
following  properties. 

(i)  The  nodes  of  the  quadrature  are  identical  to  the  nodes  of  the  interpolation 
scheme. 

(ii)  The  function  which  is  output  by  the  interpolation  scheme  depends  in  a  linear 
fashion  on  the  values  input  to  the  interpolation  scheme. 

(iii) The quadrature integrates exactly any product of two interpolated functions;
that is, for any two functions $f, g : [a,b] \to \mathbb{R}$ produced by the interpolation scheme,
the integral

(43)  $\int_a^b f(x)\,g(x)\,dx$

is computed exactly by the quadrature.

Quadratures  and  interpolation  schemes  which  possess  this  property  include  the 
following. 

EXAMPLE 4.1. The combination of a (classical) Gaussian quadrature at Legendre
nodes and polynomial interpolation at the same nodes preserves inner products, since
polynomial interpolation on n nodes produces an interpolating polynomial of order
n - 1, the product of two such polynomials is a polynomial of order 2n - 2, and a
Gaussian quadrature integrates exactly all polynomials up to order 2n - 1.

EXAMPLE  4.2.  If  an  interval  is  broken  into  several  subintervals,  and  a  quadrature 
and  interpolation  scheme  which  preserves  inner  products  is  used  on  each  subinterval, 
then  the  arrangement  as  a  whole  preserves  inner  products  on  the  original  interval. 
(This  follows  directly  from  the  definition.) 

EXAMPLE 4.3. The combination of the trapezoidal rule on the interval $[0, 2\pi]$ and
Fourier interpolation (using the interpolation functions 1, cos x, sin x, cos 2x, sin 2x,
..., cos nx, sin nx) preserves inner products.
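
The claim of Example 4.1 is easy to check numerically. The following sketch is illustrative only and is not taken from the paper: it assumes NumPy, works on the interval [-1, 1], and uses a hypothetical helper name, `check_inner_product_preservation`. It interpolates two random sets of nodal values by polynomials, integrates the product of the interpolants exactly through its antiderivative, and compares the result with the Gauss-Legendre sum over the nodal values.

```python
import numpy as np

def check_inner_product_preservation(n=6, seed=0):
    """Compare the exact integral of a product of two interpolants on [-1, 1]
    with the n-point Gauss-Legendre sum of the nodal values (Example 4.1)."""
    x, w = np.polynomial.legendre.leggauss(n)      # Legendre nodes and weights
    rng = np.random.default_rng(seed)
    f_vals = rng.standard_normal(n)                # arbitrary input values at the nodes
    g_vals = rng.standard_normal(n)
    # Interpolating polynomials of order n - 1 (coefficients, highest power first).
    pf = np.polyfit(x, f_vals, n - 1)
    pg = np.polyfit(x, g_vals, n - 1)
    # Exact integral of the order 2n - 2 product via its antiderivative.
    antiderivative = np.polyint(np.polymul(pf, pg))
    exact = np.polyval(antiderivative, 1.0) - np.polyval(antiderivative, -1.0)
    # The quadrature only needs the nodal values of the interpolants.
    quad = np.sum(w * f_vals * g_vals)
    return exact, quad

exact, quad = check_inner_product_preservation()
print(abs(exact - quad))   # expected to be at the level of rounding error
```

The two numbers agree up to rounding because the product has order 2n - 2, which is below the exactness limit 2n - 1 of the Gaussian rule.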
The algorithm takes as input a function $K : [a,b] \times [c,d] \to \mathbb{R}$. It uses the following
numerical tools:

(i) A quadrature and an interpolation scheme on the interval [a, b] which preserve
inner products. Let the weights and nodes of this quadrature be denoted by
$w_1^x, \ldots, w_n^x \in \mathbb{R}$ and $x_1, \ldots, x_n \in [a,b]$, respectively. Let the Lagrange functions (see
section 4.2) of the interpolation scheme be denoted by $\alpha_1, \ldots, \alpha_n : [a,b] \to \mathbb{R}$.

(ii) A quadrature and an interpolation scheme on the interval [c, d] which preserve
inner products. Let the weights and nodes of this quadrature be denoted by
$w_1^t, \ldots, w_m^t \in \mathbb{R}$ and $t_1, \ldots, t_m \in [c,d]$, respectively. Let the Lagrange functions of
the interpolation scheme be denoted by $\beta_1, \ldots, \beta_m : [c,d] \to \mathbb{R}$.

As will be shown below, the accuracy of the algorithm is then determined by the
accuracy to which the above two interpolation schemes approximate K.

The output of the algorithm is a sequence of functions $u_1, \ldots, u_p : [a,b] \to \mathbb{R}$,
a sequence of functions $v_1, \ldots, v_p : [c,d] \to \mathbb{R}$, and a sequence of singular values
$s_1, \ldots, s_p \in \mathbb{R}$, which form an approximation to the SVD of K.

Description of the algorithm.

(i) Construct the n x m matrix $A = [a_{ij}]$ defined by the formula

(44)  $a_{ij} = K(x_i, t_j)\,\sqrt{w_i^x\,w_j^t}.$

(ii) Compute the SVD of A to produce the factorization

(45)  $A = U S V^T,$

where $U = [u_{ij}]$ is an n x p matrix with orthonormal columns, $V = [v_{ij}]$ is an m x p
matrix with orthonormal columns, and S is a p x p diagonal matrix whose jth diagonal
entry is $s_j$.

(iii) Construct the n x p matrix $\hat U = [\hat u_{ij}]$ and the m x p matrix $\hat V = [\hat v_{ij}]$ defined
by the formulae

(46)  $\hat u_{ik} = u_{ik} / \sqrt{w_i^x},$

(47)  $\hat v_{jk} = v_{jk} / \sqrt{w_j^t}.$

(iv) For any points $x \in [a,b]$ and $t \in [c,d]$, evaluate the functions $u_k : [a,b] \to \mathbb{R}$
and $v_k : [c,d] \to \mathbb{R}$ via the formulae

(48)  $u_k(x) = \sum_{i=1}^{n} \hat u_{ik}\,\alpha_i(x),$

(49)  $v_k(t) = \sum_{j=1}^{m} \hat v_{jk}\,\beta_j(t),$

for all $k = 1, \ldots, p$.
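
Steps (i) through (iv) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions rather than the authors' implementation: it uses a single Gauss-Legendre panel with polynomial interpolation on each interval (Example 4.1), takes p = min(n, m), and the helper names `gauss_legendre`, `lagrange_basis`, and `approximate_svd`, as well as the test kernel and intervals, are hypothetical choices made for the example.

```python
import numpy as np

def gauss_legendre(lo, hi, n):
    """Gauss-Legendre nodes and weights mapped from [-1, 1] to [lo, hi]."""
    g, w = np.polynomial.legendre.leggauss(n)
    return 0.5 * (hi - lo) * g + 0.5 * (hi + lo), 0.5 * (hi - lo) * w

def lagrange_basis(nodes, pts):
    """Matrix whose (i, q) entry is the Lagrange polynomial alpha_i at pts[q]."""
    pts = np.atleast_1d(np.asarray(pts, dtype=float))
    basis = np.ones((len(nodes), len(pts)))
    for i, xi in enumerate(nodes):
        for j, xj in enumerate(nodes):
            if i != j:
                basis[i] *= (pts - xj) / (xi - xj)
    return basis

def approximate_svd(K, a, b, c, d, n, m):
    """Approximate SVD of the kernel K on [a, b] x [c, d], following (44)-(49)."""
    x, wx = gauss_legendre(a, b, n)
    t, wt = gauss_legendre(c, d, m)
    A = K(x[:, None], t[None, :]) * np.sqrt(np.outer(wx, wt))   # (44)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)            # (45), p = min(n, m)
    U_hat = U / np.sqrt(wx)[:, None]                            # (46)
    V_hat = Vt.T / np.sqrt(wt)[:, None]                         # (47)

    def u(pts):
        # (48): row k of the result holds u_k evaluated at pts
        return U_hat.T @ lagrange_basis(x, pts)

    def v(pts):
        # (49): row k of the result holds v_k evaluated at pts
        return V_hat.T @ lagrange_basis(t, pts)

    return s, u, v, (x, wx), (t, wt)

# Illustrative kernel and intervals (assumptions made for this example only).
kernel = lambda x, t: np.exp(-x * t)
s, u, v, (x, wx), (t, wt) = approximate_svd(kernel, 0.0, 1.0, 1.0, 2.0, 12, 12)
gram = (u(x) * wx) @ u(x).T   # quadrature values of the integrals of u_i * u_k
```

Here `gram` should be close to the identity matrix, and the sum of `s[k] * u_k(x) * v_k(t)` should reproduce the sampled-and-interpolated kernel. For finer resolution one would split each interval into subintervals as in Example 4.2, which this sketch does not do.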

THEOREM 4.5. Suppose that the combination of the quadrature with weights
and nodes $w_1^x, \ldots, w_n^x \in \mathbb{R}$ and $x_1, \ldots, x_n \in [a,b]$, respectively, and the interpolation
scheme with Lagrange functions $\alpha_1, \ldots, \alpha_n : [a,b] \to \mathbb{R}$, preserves inner products on
[a, b].

Suppose in addition that the combination of the quadrature with weights and nodes
$w_1^t, \ldots, w_m^t \in \mathbb{R}$ and $t_1, \ldots, t_m \in [c,d]$, respectively, and the interpolation scheme with
Lagrange functions $\beta_1, \ldots, \beta_m : [c,d] \to \mathbb{R}$, preserves inner products on [c, d].

For any function $K : [a,b] \times [c,d] \to \mathbb{R}$, let $u_i : [a,b] \to \mathbb{R}$, $v_i : [c,d] \to \mathbb{R}$, and
$s_i \in \mathbb{R}$ be defined in (44)-(49), for all $i = 1, \ldots, p$. Then

GENERALIZED  GAUSSIAN  QUADRATURES  AND  SVDs 




(i) The functions $u_i$ are orthonormal, i.e.,

(50)  $\int_a^b u_i(x)\,u_k(x)\,dx = \delta_{ik}$

for all $i, k = 1, \ldots, p$, with $\delta_{ij}$ the Kronecker symbol ($\delta_{ij} = 1$ if i = j, 0 otherwise).

(ii) The functions $v_i$ are orthonormal, i.e.,

(51)  $\int_c^d v_i(t)\,v_k(t)\,dt = \delta_{ik}$

for all $i, k = 1, \ldots, p$.

(iii) The function $\tilde K : [a,b] \times [c,d] \to \mathbb{R}$ defined by the formula

(52)  $\tilde K(x,t) = \sum_{j=1}^{p} s_j\,u_j(x)\,v_j(t)$

is identical to the function produced by sampling K on the grid of points $(x_i, t_j)$, then
interpolating with the two interpolation schemes. That is,

(53)  $\tilde K(x,t) = \sum_{i=1}^{n} \sum_{j=1}^{m} K(x_i,t_j)\,\alpha_i(x)\,\beta_j(t).$

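Proof. We first prove (53). Combining (48), (49), and (52), and writing $\hat u_{ik}$, $\hat v_{jk}$
for the scaled matrix entries defined in (46) and (47), we have

(54)
$\tilde K(x,t) = \sum_{k=1}^{p} s_k \Big( \sum_{i=1}^{n} \hat u_{ik}\,\alpha_i(x) \Big) \Big( \sum_{j=1}^{m} \hat v_{jk}\,\beta_j(t) \Big)$
$= \sum_{i=1}^{n} \sum_{j=1}^{m} \Big( \sum_{k=1}^{p} \hat u_{ik}\,s_k\,\hat v_{jk} \Big)\,\alpha_i(x)\,\beta_j(t)$
$= \sum_{i=1}^{n} \sum_{j=1}^{m} \Big( \sum_{k=1}^{p} \big(u_{ik}/\sqrt{w_i^x}\big)\,s_k\,\big(v_{jk}/\sqrt{w_j^t}\big) \Big)\,\alpha_i(x)\,\beta_j(t)$
$= \sum_{i=1}^{n} \sum_{j=1}^{m} \Big( \sum_{k=1}^{p} u_{ik}\,s_k\,v_{jk} \Big/ \sqrt{w_i^x\,w_j^t} \Big)\,\alpha_i(x)\,\beta_j(t)$
$= \sum_{i=1}^{n} \sum_{j=1}^{m} \Big( a_{ij} \Big/ \sqrt{w_i^x\,w_j^t} \Big)\,\alpha_i(x)\,\beta_j(t).$

Now (53) follows from the combination of (54) and (44).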

We now demonstrate the orthonormality of the functions $u_i$. Since these are
functions produced by interpolation, and since the quadrature on [a, b] is assumed to
integrate exactly all products of pairs of interpolated functions,

(55)
$\int_a^b u_i(x)\,u_k(x)\,dx = \sum_{j=1}^{n} w_j^x\,u_i(x_j)\,u_k(x_j)$
$= \sum_{j=1}^{n} w_j^x\,\big(u_{ji}/\sqrt{w_j^x}\big)\big(u_{jk}/\sqrt{w_j^x}\big)$
$= \sum_{j=1}^{n} u_{ji}\,u_{jk}.$

Since the last sum in (55) is the inner product of two columns of the orthonormal
matrix U (see (45)),

(56)  $\int_a^b u_i(x)\,u_k(x)\,dx = \delta_{ik}.$

The orthonormality of the functions $v_i$ is proven in the same manner.  □

Remark 4.1. Obviously, the above algorithm approximates the SVD of the operator
$T : L^2[c,d] \to L^2[a,b]$ with the kernel K by constructing an approximation $\tilde T$ with
kernel $\tilde K$ to the operator T that is of finite rank, and constructing the exact SVD of
the latter.

OBSERVATION 4.2. In the preceding proof, the assumption that each combination
of quadrature and interpolation scheme preserves inner products was used only to
demonstrate the orthonormality of the corresponding singular functions. Thus if the
conditions of Theorem 4.5 hold, with the exception that the quadrature on [a, b] does
not preserve inner products, then (51) and (53) hold (but, in general, (50) does not).

Remark 4.3. Theorem 4.5 and Lemma 4.4 generalize trivially to higher dimensions.
One-dimensional quadratures and interpolation formulae have to be replaced
with their multidimensional counterparts; otherwise, the proofs are unchanged.

5. Numerical algorithm. This section describes a numerical algorithm for the
evaluation of nodes and weights of generalized Gaussian quadratures. The algorithm's
input is a sequence of functions $\phi_1, \ldots, \phi_{2n} : [a,b] \to \mathbb{R}$ which form an extended
Chebyshev system on [a, b], and a weight function $\omega_1 : [a,b] \to \mathbb{R}^+$. Its output is the
weights and nodes of the quadrature. The main components of the algorithm are as
follows (not listed in order of execution).

(i) Newton's method is used to solve (30), which defines the Gaussian quadrature.

(ii) An adaptive version of the continuation method (section 3.1.1) is used to
provide starting points for Newton's method. The continuation scheme used here is
different from that used in [10]; the details of the continuation scheme and of the
method of adaption are described below.

(iii) The algorithm of section 4.3 can be used as an optional preprocessing step,
which takes as input a kernel of an integral operator and produces its singular functions.
The first 2n of the left singular functions are then used as input to the main
algorithm.

5.1. Continuation scheme. The continuation scheme used is as follows. Let
the weight functions $\omega : [0,1] \times [a,b] \to \mathbb{R}^+$ be defined by the formula

(57)  $\omega(\alpha, x) = \alpha\,\omega_1(x) + (1 - \alpha) \sum_{j=1}^{n} \delta(x - c_j),$

where $\omega_1$ is the weight function for which a Gaussian quadrature is desired, $\delta$ denotes
the Dirac delta function, and the points $c_j \in [a,b]$ are arbitrary distinct points. These
weight functions have the following properties.

(i) With $\alpha = 1$, the weight function is equal to the desired weight function $\omega_1$,
due to (57).

(ii) With $\alpha = 0$, the Gaussian weights and nodes are

(58)  $w_j = 1,$
(59)  $x_j = c_j,$

for $j = 1, \ldots, n$, whatever the functions $\phi_i$ are (since $\omega(0,x) = 0$, unless $x = c_j$ for
some $j \in [1,n]$).

(iii) The quadrature weights and nodes depend continuously on $\alpha$ (by Corollary
4.2).

The intermediate problems which the continuation method solves are the Gaussian
quadratures relative to the weight functions $\omega(\alpha, \cdot)$. The scheme starts by setting
$\alpha = 0$, then increases $\alpha$ in an adaptive manner until $\alpha = 1$, as follows. A current
step size is maintained, by which $\alpha$ is incremented after each successful termination
of Newton's method. After each unsuccessful termination of Newton's method, the
step size is halved and the algorithm restarts from the point yielded by the last successful
termination. After a certain number of successful steps, the current step size
is doubled. (Experimentally, the current problem was found to be well suited to an
aggressive mode of adaption: in the authors' implementation, the initial value of the
step size was chosen to be one, and the step size was doubled after a single successful
termination of Newton's method.)
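
The interplay of Newton's method and the adaptive continuation can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the functions are the monomials 1, x, ..., x^5 on [-1, 1] with weight function 1 (so the target is the classical 3-point Gauss-Legendre rule), the unknowns are ordered as (weights, nodes) to match the column order of the Jacobian (33), and all function names, tolerances, and the minimum step size are hypothetical. For the delta-function weight (57), the right-hand side for each function is the blend of its integral against the desired weight and the sum of its values at the points c_j.

```python
import numpy as np

def residual_and_jacobian(z, phis, dphis, rhs):
    """Residual of the system (30) and Jacobian (33); z = (w_1..w_n, x_1..x_n)."""
    n = len(z) // 2
    w, x = z[:n], z[n:]
    P = np.array([phi(x) for phi in phis])       # P[k, j]  = phi_k(x_j)
    dP = np.array([dphi(x) for dphi in dphis])   # dP[k, j] = phi_k'(x_j)
    return P @ w - rhs, np.hstack([P, dP * w])

def newton(z0, phis, dphis, rhs, tol=1e-12, max_iter=30):
    """Plain Newton iteration; returns None on failure to converge."""
    z = z0.copy()
    with np.errstate(all="ignore"):
        for _ in range(max_iter):
            F, J = residual_and_jacobian(z, phis, dphis, rhs)
            if not np.all(np.isfinite(F)):
                return None
            if np.max(np.abs(F)) < tol:
                return z
            try:
                z = z - np.linalg.solve(J, F)
            except np.linalg.LinAlgError:
                return None
    return None

def continuation_quadrature(phis, dphis, moments, c, min_step=1e-8):
    """Follow the weight function (57) from alpha = 0 to alpha = 1."""
    c = np.asarray(c, dtype=float)
    n = len(c)
    delta_part = np.array([np.sum(phi(c)) for phi in phis])  # integrals against the deltas
    z = np.concatenate([np.ones(n), c])   # exact solution at alpha = 0: (58), (59)
    alpha, step = 0.0, 1.0                # initial step size of one, as in the text
    while alpha < 1.0:
        trial = min(1.0, alpha + step)
        rhs = trial * moments + (1.0 - trial) * delta_part
        z_new = newton(z, phis, dphis, rhs)
        if z_new is None:
            step /= 2.0                   # failure: halve and retry from last success
            if step < min_step:
                raise RuntimeError("continuation step size underflow")
        else:
            z, alpha = z_new, trial
            step *= 2.0                   # success: double the step
    return z[:n], z[n:]                   # weights, nodes

n = 3
phis = [lambda x, k=k: x**k for k in range(2 * n)]
dphis = [lambda x, k=k: k * x**max(k - 1, 0) for k in range(2 * n)]
moments = np.array([2.0 / (k + 1) if k % 2 == 0 else 0.0 for k in range(2 * n)])
weights, nodes = continuation_quadrature(phis, dphis, moments, c=[-0.5, 0.0, 0.5])
```

Doubling after every success mirrors the aggressive adaption mentioned in the text; a more cautious variant would double only after several consecutive successes.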

5.1.1. Comparison to continuation method of [10]. The continuation
method of this paper differs from the continuation method of [10] in that a different
part of the system of equations is changed as a function of the continuation variable
$\alpha$. In [10], the thing changed is not the weight function $\omega$ but rather the functions
$\phi_1, \ldots, \phi_{2n}$ which the quadrature is to integrate properly. Each of these functions is
altered according to the formula

(60)  $\phi_i(\alpha, x) = \alpha\,\phi_i(x) + (1 - \alpha)\,P_i(x),$

where $\phi_1, \ldots, \phi_{2n}$ are the functions for which the quadrature was desired, and where
$P_1, \ldots, P_{2n}$ are some sequence of functions for which a Gaussian quadrature is known
(for instance, polynomials). That continuation method has the drawback that the
functions $\phi_1(\alpha,\cdot), \ldots, \phi_{2n}(\alpha,\cdot)$ do not necessarily form an extended Chebyshev system when
$0 < \alpha < 1$, even if the functions $\phi_1, \ldots, \phi_{2n}$ form an extended Chebyshev system. For
instance, if the quadrature is to integrate two functions, $\phi_1 = P_2$ and $\phi_2 = P_1$, then
when $\alpha = 1/2$, the functions $\phi_1(\alpha,\cdot)$ and $\phi_2(\alpha,\cdot)$ are identical, so the Jacobian matrix (33) is
singular, whatever the (single) quadrature node $x_1$ might be.

5.1.2. Starting points. The choice of the points $c_j$ was left indefinite above.
In exact arithmetic the algorithm would converge for any choice of distinct points
(see Lemma 4.1). However, the number of steps of the continuation method, and thus
the speed of execution, is affected by the choice. More importantly, the numerical
stability of the scheme might be compromised due to poor conditioning of the matrix
J (see (33)). Indeed, while Lemma 4.1 guarantees that the matrix J is nonsingular, it
says nothing about its condition number. Thus, in the authors' implementation, the
points $c_j$ used for the production of the quadrature of order n were computed from
the nodes $x_j$ of the quadrature of order n - 1 by the formulae

(61)  $c_1 = x_1,$

(62)  $c_i = (x_{i-1} + x_i)/2, \quad i = 2, \ldots, n-1,$

(63)  $c_n = x_{n-1}.$

With this choice, no failures to converge have been encountered in the authors' experience.
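
Formulae (61) through (63) translate directly into code. The sketch below is illustrative; the function name `starting_points` is hypothetical, and the guard for fewer than two previous nodes is an added assumption, since with a single previous node the formulae would give two coincident points.

```python
def starting_points(prev_nodes):
    """Points c_1..c_n for the order-n quadrature from the n - 1 nodes of the
    order-(n - 1) quadrature, following (61)-(63)."""
    x = sorted(prev_nodes)
    if len(x) < 2:
        # Assumption: the formulae need at least two nodes to give distinct points.
        raise ValueError("need at least two previous nodes")
    n = len(x) + 1
    c = [x[0]]                                                 # (61): c_1 = x_1
    c += [(x[k - 1] + x[k]) / 2.0 for k in range(1, n - 1)]    # (62): midpoints
    c.append(x[-1])                                            # (63): c_n = x_{n-1}
    return c

print(starting_points([0.1, 0.5, 0.9]))
```

For the three nodes 0.1, 0.5, 0.9 this yields four points: the two end nodes and the two midpoints between neighbouring nodes.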
6. Numerical examples. A variety of quadratures were generated to illustrate
the performance of the above algorithm. In each case the preprocessing step of producing
singular functions was used. This step requires two sets of quadratures and
interpolation schemes, which must approximate the desired kernel to the desired accuracy.
These quadratures and interpolation schemes were chosen so that the approximation
was accurate to about the precision of the arithmetic that was used.
The following combination of quadrature and interpolation scheme which preserves
inner products was used: the interval of integration was divided into several subintervals,
and a combination of a (classical) Gaussian quadrature at Legendre nodes and
polynomial interpolation was used on each subinterval.

In each of the following examples, the calculations were done in extended precision
(Fortran REAL*16) arithmetic, with the exception of the last example, which was done
in double precision (REAL*8) arithmetic.

6.1. Exponentials. In this example we construct quadratures for the integral

(64)  $\int_0^\infty e^{-xt}\,dx$

under the condition that $1 \le t \le 500$. In this case, the corresponding kernel
$K : [0,\infty) \times [1,500] \to \mathbb{R}$ is given by

(65)  $K(x,t) = e^{-xt}$

and is ETP; thus its singular functions form an extended Chebyshev system. The
measured maximum absolute error of integration of the produced quadratures, over
the range $1 \le t \le 500$, is given, for selected n, in the following table.

  n       6           8           14          23          27
  Error   0.827E-03   0.726E-04   0.366E-07   0.356E-12   0.323E-14

The weights and nodes of the 27-point quadrature are included as Table 6.1; the
remaining weights and nodes are available electronically at the URL
http://www.netlib.org/pdes/multipole/vts500.f.

6.2. Complex exponentials. Here, we design quadratures for a new version [5]
of the two-dimensional fast multipole method. These quadratures are for the integral

(66)  $\int_0^\infty e^{-xz}\,dx,$

under the condition that $z \in \mathbb{C}$ is constrained to lie in the region D of the complex
plane which consists of the rectangle [1, 4] x [-4, 4] with a 1 x 1 square deleted from each
of its two left-hand corners, as depicted in Figure 1. Since both the true integral (equal
to 1/z) and the quadrature which approximates the integral are complex analytic
on that region, due to the maximum modulus principle the maximum error of the
quadrature is achieved on the boundary $\partial D$ of the region. Accordingly, the kernel
whose singular functions were computed was $K(x,z) = e^{-xz}$, with z varying over $\partial D$.
A brief examination of the resulting singular functions shows that they do not form
a Chebyshev system; if they did so, the ith function would have i - 1 zeros, yet it
has many more. Thus, the algorithm is not guaranteed to work; however, it did so.
The measured maximum absolute error of integration of the produced quadratures is
given, for selected n, in the following table.
Table 6.1
27-point generalized Gaussian quadrature for decaying exponentials.

  Node (x_j)                 Weight (w_j)
  0.5378759010624780E-03     0.1383311204046008E-02
  0.2860176825815242E-02     0.3279869733166365E-02
  0.7148658617716300E-02     0.5330932895600203E-02
  0.1360965515937845E-01     0.7646093110803760E-02
  0.2257800188133212E-01     0.1037458793227033E-01
  0.3456421989535069E-01     0.1372178039022047E-01
  0.5032042618508775E-01     0.1796868836009351E-01
  0.7092509447124836E-01     0.2348971809947674E-01
  0.9788439120828463E-01     0.3076860552710760E-01
  0.1332509921950535E+00     0.4041894092839717E-01
  0.1797695570864978E+00     0.5321827718681367E-01
  0.2410654714132133E+00     0.7016094768858448E-01
  0.3218961915636380E+00     0.9253048536912244E-01
  0.4284852078938826E+00     0.1219928996130354E+00
  0.5689615509235298E+00     0.1607156476580828E+00
  0.7539347736933301E+00     0.2115215602167892E+00
  0.9972472224438443E+00     0.2780925850550500E+00
  0.1316964566299846E+01     0.3652478333806065E+00
  0.1736698582009859E+01     0.4793398853949993E+00
  0.2287418444638146E+01     0.6288554258416082E+00
  0.3010034073439038E+01     0.8254021100491956E+00
  0.3959315495048493E+01     0.1085495633209734E+01
  0.5210381702393131E+01     0.1434174907278760E+01
  0.6870768194824406E+01     0.1913323186889750E+01
  0.9106577764323245E+01     0.2604342790201154E+01
  0.1221294512896673E+02     0.3708436699287805E+01
  0.1689348652665484E+02     0.6023086156615004E+01

  n       7           10          17          26          32
  Error   0.107E-02   0.398E-04   0.156E-07   0.801E-12   0.282E-14

The weights and nodes of the quadratures are available electronically at the URL
http://www.netlib.org/pdes/multipole/pwts4.f.

6.3. Exponentials multiplied by I_0. In this example, quadrature formulae
are constructed for integrals of the form

(67)  $\int_0^\infty I_0(xy)\,e^{-xt}\,dx,$

under the condition that $t \in [1,500]$ and $y \in [0, t-1]$; these formulae were designed
to be used in a version of the one-dimensional fast multipole method which is used
in an algorithm [6] for the fast Hankel transform. In this case the singular functions
produced by the precomputation stage were extremely similar to those for exponentials
alone; unlike in the case of complex exponentials, it is possible that they form
a Chebyshev system. In any case, the algorithm converged, producing a quadrature
which required two more nodes for double precision accuracy than were required
for the integration of exponentials alone. The measured maximum absolute error of
integration of the produced quadratures is given, for selected n, in the following table.

  n       6           8           14          24          29
  Error   0.997E-03   0.892E-04   0.900E-07   0.925E-12   0.299E-14

Fig. 1. Range of coefficient z of complex exponentials to be integrated.

The weights and nodes of the quadratures are available electronically at the URL
http://www.netlib.org/pdes/multipole/swts500.f.

6.4. Exponentials multiplied by J_0. Here, we construct quadratures for the
integral

(68)  $\int_0^\infty J_0(xy)\,e^{-xt}\,dx,$

under the conditions that $t \in [1,4]$ and $y \in [0, 4\sqrt{2}]$, and where $J_0$ denotes the Bessel
function of the first kind of order zero. These quadratures are used in a new version
[4] of the three-dimensional fast multipole method. $J_0$ is given by the well-known (see
for instance [1]) formula

(69)  $J_0(y) = \frac{1}{\pi} \int_0^\pi \cos(y \cos\theta)\,d\theta.$

Substituting (69) into (68) yields the integral

(70)  $\int_0^\infty \frac{1}{\pi} \int_0^\pi \cos(xy\cos\theta)\,d\theta\;e^{-xt}\,dx = \frac{1}{\pi} \int_0^\pi \int_0^\infty \cos(xy\cos\theta)\,e^{-xt}\,dx\,d\theta.$

Thus a quadrature accurate for the integral

(71)  $\int_0^\infty \cos(xy)\,e^{-xt}\,dx,$

under the conditions that $t \in [1,4]$ and $y \in [0, 4\sqrt{2}]$, is also accurate for the integral
(68) under the same conditions on y and t. Since the function $\cos(xy)\,e^{-xt}$ is a
harmonic function of y and t, by the maximum principle for harmonic functions the maximum
error of a quadrature for (71) lies on the boundary $\partial D$ of the rectangular region
$t \in [1,4]$, $y \in [0, 4\sqrt{2}]$. Accordingly, the kernel whose singular functions were computed
was $K(x,(t,y)) = \cos(xy)\,e^{-xt}$, with (t, y) varying over $\partial D$. As in the case of complex
exponentials, the singular functions have too many zeros to form a Chebyshev system;
however, the algorithm converged.
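
The reduction from the Bessel-function integral to the cosine integral can be checked numerically using two standard closed forms, which are stated here as outside facts rather than results of the paper: the cosine integral equals t/(t^2 + y^2), and the Bessel-function integral equals 1/sqrt(t^2 + y^2). The sketch below (NumPy, hypothetical function names, midpoint rule in the angle) averages the cosine closed form over the angle, as the substitution of the integral representation of J_0 prescribes, and compares the result with the Bessel closed form.

```python
import numpy as np

def laplace_cos(t, y):
    """Closed form of the integral of cos(x y) exp(-x t) over x in [0, inf), t > 0."""
    return t / (t * t + y * y)

def laplace_j0_via_average(t, y, n_theta=2000):
    """(1/pi) * integral over theta in [0, pi] of laplace_cos(t, y cos(theta)),
    approximated by the midpoint rule with n_theta points."""
    theta = (np.arange(n_theta) + 0.5) * np.pi / n_theta
    return np.mean(laplace_cos(t, y * np.cos(theta)))

def laplace_j0_exact(t, y):
    """Closed form of the integral of J_0(x y) exp(-x t) over x in [0, inf), t > 0."""
    return 1.0 / np.sqrt(t * t + y * y)

# Corner of the parameter rectangle t in [1, 4], y in [0, 4*sqrt(2)].
t, y = 1.0, 4.0 * np.sqrt(2.0)
print(laplace_j0_via_average(t, y), laplace_j0_exact(t, y))
```

Since |y cos(theta)| never exceeds y and the cosine integral is even in y, every value needed in the average lies within the parameter range for which the cosine quadrature was designed.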

The  measured  maximum  absolute  error  of  integration  of  the  produced  quadratures 
is  given,  for  selected  n,  in  the  following  table. 


  n       8           12          21          31          40
  Error   0.162E-02   0.709E-04   0.553E-07   0.195E-10   0.147E-13

The weights and nodes of the quadratures are available electronically at the URL
http://www.netlib.org/pdes/multipole/vwts.f.

6.5.  Numerical  observations.  The  following  observations  were  made  in  the 
course  of  our  numerical  experiments. 

(i)  The  number  of  continuation  steps  required  is  highly  variable;  in  many  cases, 
only  one  step  sufficed  to  produce  the  quadrature;  less  frequently,  up  to  fifty  or  so 
continuation  steps  were  required.  This  variability  occurred  even  between  quadratures 
for  successive  numbers  n  of  nodes,  with  the  same  weight  function  and  kernel  K . 

(ii)  The  algorithm  worked  in  the  cases  where  Theorem  2.1  applied,  and  also 
in  cases  where  it  did  not.  In  the  latter  cases,  it  is  conceivable  that  the  resulting 
quadratures  would  have  negative  weights  or  that  they  would  not  be  unique.  However, 
all  computed  weights  were  positive,  and,  while  no  systematic  attempt  was  made  to 
look  for  nonuniqueness  of  the  quadratures,  no  instance  of  it  was  observed. 

7.  Generalizations  and  applications. 

(i) The success of the algorithm in instances where Theorem 2.1 does not apply
suggests that further theoretical investigation of conditions for the existence of
generalized Gaussian quadratures would be profitable.

(ii)  An  obvious  generalization  of  these  results  is  to  quadratures  for  integrals  in 
more  than  one  dimension.  However,  such  an  extension  does  not  seem  to  have  been 
explored  classically;  the  authors  are  investigating  a  generalization  of  Theorem  2.1  for 
multidimensional  quadratures. 

(iii) An obvious application of the algorithm of this paper is for the efficient
evaluation of functions represented by their integral transforms (see sections 6.1, 6.2,
6.3, 6.4 above, as well as [5] and [4]). The method of steepest descent in numerical
complex analysis provides a wide field of applications for such algorithms.

(iv)  An  entirely  different  field  of  applications  involves  the  numerical  solution  of 
integral  equations  with  singular  kernels;  of  particular  interest  are  boundary  integral 
equations  of  scattering  theory  on  regions  with  corners.  The  authors  are  currently 
pursuing  this  direction  of  research. 


REFERENCES 

[1] M. ABRAMOWITZ AND I. STEGUN, Handbook of Mathematical Functions, Applied Mathematics
Series, National Bureau of Standards, Washington, DC, 1964.


N. YARVIN AND V. ROKHLIN


[2] F. GANTMACHER AND M. KREIN, Oscillation Matrices and Kernels and Small Oscillations
of Mechanical Systems, 2nd ed., Gosudarstv. Izdat. Tehn-Teor. Lit., Moscow, 1950 (in
Russian).

[3] G. H. GOLUB AND C. F. VAN LOAN, Matrix Computations, Johns Hopkins University Press,
Baltimore, 1983.

[4] L. GREENGARD AND V. ROKHLIN, A new version of the fast multipole method for the Laplace
equation in three dimensions, Acta Numerica, 6 (1997), pp. 229-269.

[5] T. HRYCAK AND V. ROKHLIN, An Improved Fast Multipole Algorithm for Potential Fields,
Research Report 1089, Computer Science Department, Yale University, New Haven, CT,
1995.

[6] S. KAPUR AND V. ROKHLIN, An Algorithm for the Fast Hankel Transform, Technical Report
1045, Computer Science Department, Yale University, New Haven, CT, 1995.

[7] S. KARLIN, The existence of eigenvalues for integral operators, Trans. Amer. Math. Soc., 113
(1964), pp. 1-17.

[8] S. KARLIN AND W. J. STUDDEN, Tchebycheff Systems with Applications in Analysis and Statistics,
John Wiley (Interscience), New York, 1966.

[9] M. G. KREIN, The Ideas of P. L. Chebyshev and A. A. Markov in the Theory of Limiting
Values of Integrals, Amer. Math. Soc. Transl. 2, AMS, Providence, RI, 1959, pp. 1-122.

[10] J. MA, V. ROKHLIN, AND S. WANDZURA, Generalized Gaussian quadrature rules for systems of
arbitrary functions, SIAM J. Numer. Anal., 34 (1996), pp. 971-996.

[11] A. A. MARKOV, On the limiting values of integrals in connection with interpolation, Zap. Imp.
Akad. Nauk. Fiz.-Mat. Otd. (8) 6 (1898), no. 5 (in Russian); pp. 146-230 of [12].

[12] A. A. MARKOV, Selected Papers on Continued Fractions and the Theory of Functions Deviating
Least from Zero, OGIZ, Moscow-Leningrad, 1948 (in Russian).

[13] M. REED AND B. SIMON, Methods of Modern Mathematical Physics, Vol. 1, Academic Press,
New York, 1980.

[14] J. STOER AND R. BULIRSCH, Introduction to Numerical Analysis, 2nd ed., Springer-Verlag, New
York, 1993.


AN  INTEGRAL  EVOLUTION  FORMULA  FOR  THE  WAVE 

EQUATION* 

BRADLEY ALPERT†, LESLIE GREENGARD‡, AND THOMAS HAGSTROM§

Abstract.  We  present  a  new  time-symmetric  evolution  formula  for  the  scalar  wave  equation.  It 
is  simply  related  to  the  classical  D’Alembert  or  spherical  means  representations,  but  applies  equally 
well  in  two  space  dimensions.  It  can  be  used  to  develop  stable,  robust  numerical  schemes  on  irregular 
meshes. 


1. Introduction. It is notoriously difficult to construct stable high-order explicit
marching schemes for the wave equation on irregular meshes. The time-step
restriction  is  typically  determined  by  the  smallest  cell  present  in  the  discretization.  In 
this  note,  we  describe  a  new  approach  to  the  construction  of  stable,  explicit  schemes, 
based  on  a  simple  time-symmetric  evolution  formula. 

Initially we consider the Cauchy problem in $\mathbb{R}^d$,

$$\begin{aligned}
u_{tt} &= \Delta u, \\
u(x,0) &= u_0(x), \\
u_t(x,0) &= v_0(x),
\end{aligned} \tag{1.1}$$

where $\Delta$ denotes the Laplacian operator. In one space dimension, the solution can be
written using D'Alembert's formula as

$$u(x,t) = \frac{1}{2}\bigl(u_0(x-t) + u_0(x+t)\bigr) + \frac{1}{2}\int_{x-t}^{x+t} v_0(s)\,ds. \tag{1.2}$$

We can eliminate the term involving the data $v_0(x)$ by using the time-symmetric form:

$$u(x,t) + u(x,-t) = u(x-t,0) + u(x+t,0). \tag{1.3}$$
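
As a quick sanity check, the cancellation of the velocity data in the time-symmetric form can be confirmed numerically. The sketch below evaluates D'Alembert's formula at times t and -t and compares the sum with the right-hand side of (1.3). The initial data `u0` and the antiderivative `V0` of the initial velocity are arbitrary illustrative choices, not taken from the text.

```python
import math

def dalembert(u0, V0, x, t):
    """Evaluate D'Alembert's formula (1.2); V0 is an antiderivative of v0."""
    return 0.5 * (u0(x - t) + u0(x + t)) + 0.5 * (V0(x + t) - V0(x - t))

u0 = math.sin                        # illustrative initial displacement
V0 = lambda s: math.exp(-s * s)      # illustrative antiderivative of v0
x, t = 0.3, 0.7

# Left side of (1.3): the v0 contributions at +t and -t cancel.
lhs = dalembert(u0, V0, x, t) + dalembert(u0, V0, x, -t)
# Right side of (1.3): only the initial displacement appears.
rhs = u0(x - t) + u0(x + t)
print(abs(lhs - rhs))
```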

In three dimensions, the analog of (1.3) is the spherical means formula [2, 4, 5]

$$u(x,t) + u(x,-t) = \frac{\partial}{\partial t}\left[\frac{1}{2\pi t}\int_{|y-x|=t} u(y,0)\,d\sigma\right], \tag{1.4}$$

where $d\sigma$ is an element of surface area. In two dimensions, the situation is slightly
more complex because of the absence of a strong Huygens' principle. The solution
depends not just on function values over the boundary of the disk of radius $t$, but on
all values in its interior:

$$u(x,t) + u(x,-t) = \frac{\partial}{\partial t}\left[\frac{1}{\pi}\int_{|y-x|<t} \frac{u(y,0)}{\sqrt{t^2 - |x-y|^2}}\,dy\right]. \tag{1.5}$$

For numerical computation, formulas of the type (1.3), (1.4), and (1.5) are not
widely used because they do not suggest a procedure at physical boundaries and are
not easily extended to more general partial differential equations.

2. A central difference evolution formula. Consider the Fourier transform
of the wave function $u(x,t)$, namely

$$U(k,t) = \left(\frac{1}{2\pi}\right)^{d/2}\int_{\mathbb{R}^d} e^{-ik\cdot x}\,u(x,t)\,dx.$$

The partial differential equation in (1.1) can then be replaced by

$$U_{tt}(k,t) = -|k|^2\,U(k,t).$$

Solving  this  ordinary  differential  equation,  we  obtain 

$$U(k,t) + U(k,-t) = 2U(k,0)\cos(|k|t)$$

or

$$U(k,t) - 2U(k,0) + U(k,-t) = \left(\frac{2\cos(|k|t) - 2}{-|k|^2}\right)\bigl(-|k|^2\bigr)\,U(k,0). \tag{2.1}$$


Our  main  result  follows. 

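THEOREM 2.1. Let $u(x,t)$ denote a solution to the homogeneous wave equation

$$u_{tt} = \Delta u$$

in $\mathbb{R}^d$. Then

$$u(x,t) - 2u(x,0) + u(x,-t) = \int_{|y-x|<t} G_d(|x-y|,t)\,\Delta u(y,0)\,dy, \tag{2.2}$$

where

$$G_1(r,t) = t - r, \tag{2.3}$$

$$G_2(r,t) = \frac{1}{\pi}\left[\ln\bigl(t + \sqrt{t^2 - r^2}\bigr) - \ln r\right], \tag{2.4}$$

$$G_3(r,t) = \frac{1}{2\pi r}. \tag{2.5}$$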

Proof. The formula (2.2) is obtained from the convolution theorem by transforming
(2.1) back to physical space. We provide a few more details for two space
dimensions, where we need to evaluate the kernel

$$G_2(x,t) = \frac{1}{(2\pi)^2}\int_{\mathbb{R}^2} \frac{2 - 2\cos(|k|t)}{|k|^2}\,e^{ik\cdot x}\,dk.$$

Changing to polar coordinates, we have

$$G_2(r,t) = \frac{1}{(2\pi)^2}\int_0^\infty\!\!\int_0^{2\pi} \frac{2 - 2\cos(kt)}{k^2}\,e^{ikr\cos(\theta-\phi)}\,k\,dk\,d\phi
= \frac{1}{2\pi}\int_0^\infty \frac{2 - 2\cos(kt)}{k}\,J_0(kr)\,dk,$$

where $k = (k\cos\phi, k\sin\phi)$, $x = (r\cos\theta, r\sin\theta)$, and $J_0$ denotes the Bessel function
of order zero. The desired result now follows from the formula ([1], 6.693)

$$\int_0^\infty J_\nu(kr)\cos(kt)\,\frac{dk}{k} =
\begin{cases}
\dfrac{1}{\nu}\cos\left(\nu\arcsin\dfrac{t}{r}\right), & t < r, \\[2ex]
\dfrac{r^\nu\cos\dfrac{\nu\pi}{2}}{\nu\bigl(t + \sqrt{t^2 - r^2}\bigr)^\nu}, & t > r,
\end{cases}$$

with some care in taking the limit $\nu \to 0$. □
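
One way to sanity-check the normalization of the kernels is to apply (2.2) to data with constant Laplacian. If $\Delta u(y,0) \equiv 1$ and $u$ is even in time, then $u(x,t) = u(x,0) + t^2/2$, so the left-hand side of (2.2) equals $t^2$ and each kernel must integrate to $t^2$ over the ball of radius $t$. The sketch below checks this with a midpoint rule in the radial variable. It assumes the kernel forms $G_1 = t - r$, $G_2 = \frac{1}{\pi}[\ln(t + \sqrt{t^2 - r^2}) - \ln r]$ and $G_3 = 1/(2\pi r)$; the function names are illustrative.

```python
import math

def G(d, r, t):
    """Kernels G_d(r, t) of (2.3)-(2.5), valid for 0 < r < t."""
    if d == 1:
        return t - r
    if d == 2:
        return (math.log(t + math.sqrt(t * t - r * r)) - math.log(r)) / math.pi
    if d == 3:
        return 1.0 / (2.0 * math.pi * r)
    raise ValueError("d must be 1, 2, or 3")

def shell_measure(d, r):
    """Measure of the sphere |y - x| = r in R^d (two points when d = 1)."""
    return {1: 2.0, 2: 2.0 * math.pi * r, 3: 4.0 * math.pi * r * r}[d]

def ball_integral(d, t, n=20000):
    """Midpoint-rule approximation of the integral of G_d over |y - x| < t."""
    h = t / n
    total = 0.0
    for j in range(n):
        r = (j + 0.5) * h
        total += G(d, r, t) * shell_measure(d, r) * h
    return total

t = 0.8
for d in (1, 2, 3):
    print(d, ball_integral(d, t), t * t)
```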

Remark 2.2. Integration by parts and Green's identities can be used to recover
the formulas (1.3), (1.4), and (1.5) from (2.2).

Remark 2.3. Our evolution scheme can be viewed as an integral form of the
widely-used Lax-Wendroff method. The latter method uses central differencing in
time to generate the series

$$u(x,t) - 2u(x,0) + u(x,-t) = t^2 u_{tt}(x,0) + \frac{t^4}{12}\,u_{tttt}(x,0) + \frac{t^6}{360}\,u_{tttttt}(x,0) + \cdots.$$

Replacing the time derivatives with powers of the Laplacian, one obtains

$$u(x,t) - 2u(x,0) + u(x,-t) = t^2 \Delta u(x,0) + \frac{t^4}{12}\,\Delta^2 u(x,0) + \frac{t^6}{360}\,\Delta^3 u(x,0) + \cdots.$$

Once  a  numerical  approximation  is  chosen  for  the  Laplacian  operator,  the  Lax- 
Wendroff  scheme  achieves  arbitrary  order  accuracy  in  time  by  incorporating  higher 
and  higher  powers  of  the  Laplacian  in  a  three  time  level  scheme.  Stability  and  spatial 
accuracy  depend,  of  course,  on  how  the  Laplacian  is  computed. 

3. Forcing. We next consider the wave equation with a source term

$$u_{tt} = \Delta u + f, \tag{3.1}$$

which from Fourier transformation ($u \to U$, $f \to F$) becomes

$$U_{tt}(k,t) = -|k|^2\,U(k,t) + F(k,t),$$

whose solution is given by

$$U(k,t) - 2U(k,0) + U(k,-t) = 2\bigl[\cos(|k|t) - 1\bigr]U(k,0) + \int_{-t}^{t} \frac{\sin\bigl(|k|(t - |s|)\bigr)}{|k|}\,F(k,s)\,ds.$$

The identity

$$\frac{\sin(|k|t)}{|k|} = -\frac{\partial}{\partial t}\left(\frac{\cos(|k|t) - 1}{|k|^2}\right)$$

and integration by parts, in combination with (2.2), now yield

Theorem 3.1. Let $u(x,t)$ denote a solution to the inhomogeneous wave equation
(3.1) in $\mathbb{R}^d$. Then

$$\begin{aligned}
u(x,t) - 2u(x,0) + u(x,-t) ={}& \int_{|y-x|<t} G_d(|x-y|,t)\,\bigl[\Delta u(y,0) + f(y,0)\bigr]\,dy \\
&+ \frac{1}{2}\int_{-t}^{t} \operatorname{signum}(s) \int_{|y-x|<t-|s|} G_d(|x-y|,\,t-|s|)\,f'(y,s)\,dy\,ds,
\end{aligned} \tag{3.2}$$

where $G_d$ is given in (2.3)-(2.5) and $f'(x,t) = \partial f(x,t)/\partial t$.

Remark 3.2. The derivative $f'$ of the forcing term may be analytically removed
from (3.2) by integration, yielding formulas that differ somewhat for $d = 1, 2, 3$. In
three dimensions, for example, the double integral reduces to the particularly simple
form

$$\frac{1}{4\pi}\int_{|y-x|<t} \frac{f(y,\,|x-y| - t) - 2f(y,0) + f(y,\,t - |x-y|)}{|x-y|}\,dy.$$


4. Discretization. In order to use formula (2.2) or (3.2) for computation, we
need to evaluate the integral

$$Qu(x) = \int_{|y-x|<t} G_d(|x-y|,t)\,\Delta u(y,0)\,dy \tag{4.1}$$

for each discretization point $x$. In this brief note, we will restrict our attention to the
one-dimensional case. Away from physical boundaries, there are three clear options:

1. Use a quadrature formula designed for formula (4.1):

$$Qu(x) = \int_{x-t}^{x+t} \bigl(t - |y-x|\bigr)\,u_{yy}(y,0)\,dy. \tag{4.2}$$

2. Integrate by parts once to obtain

$$Qu(x) = -\int_{x-t}^{x} u_y(y,0)\,dy + \int_{x}^{x+t} u_y(y,0)\,dy. \tag{4.3}$$

3. Integrate by parts twice to obtain

$$Qu(x) = u(x-t,0) - 2u(x,0) + u(x+t,0).$$

All  three  formulas  are  exact  (the  last  yielding  the  time-symmetric  scheme  (1.3)). 
In  the  first  case,  one  needs  to  approximate  uxx  within  the  domain  of  dependence.  In 
the  second  case,  one  needs  to  approximate  ux  within  the  domain  of  dependence.  In 
the  third  case,  one  needs  to  interpolate  u(x  -  t,  0)  and  u(x  +  t ,  0)  from  the  possibly 
irregular  mesh  points  where  u(x,  0)  is  known.  The  stability  of  each  scheme  will  depend 
on  how  the  interpolation/approximation  problem  is  handled. 
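
The equivalence of the three options is easy to confirm numerically. The sketch below evaluates $Qu(x)$ by (4.2), by (4.3), and by the twice-integrated form for a smooth test function, using a plain midpoint rule split at $y = x$ so that the kink of $t - |y - x|$ falls on a cell boundary. The test function $\sin 2y$, the point, and the step are illustrative assumptions.

```python
import math

def midpoint(f, a, b, n=4000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (j + 0.5) * h) for j in range(n))

# Illustrative smooth data at time 0, with its first two derivatives.
u = lambda y: math.sin(2.0 * y)
uy = lambda y: 2.0 * math.cos(2.0 * y)
uyy = lambda y: -4.0 * math.sin(2.0 * y)

x, t = 0.4, 0.25
kernel = lambda y: (t - abs(y - x)) * uyy(y)

q1 = midpoint(kernel, x - t, x) + midpoint(kernel, x, x + t)   # option 1, (4.2)
q2 = -midpoint(uy, x - t, x) + midpoint(uy, x, x + t)          # option 2, (4.3)
q3 = u(x - t) - 2.0 * u(x) + u(x + t)                          # option 3

print(q1, q2, q3)
```

In an actual scheme the derivatives would not be known analytically; the three options differ in which quantity has to be approximated from mesh values.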

To  demonstrate  the  value  of  the  integral  formulation,  we  suppose  that  we  are 
solving  the  problem  (1.1)  with  the  Dirichlet  boundary  condition  u(0,t)  =  g(t).  For 
the  sake  of  simplicity,  we  assume  that  the  grid  spacing  in  x  is  equal  to  the  time  step 
t. The only irregular point is the first grid point $x_1$, which is arbitrarily close to the
boundary x = 0, creating what is often referred to as a small cell problem (Fig. 4.1).
Fig. 4.1. An irregular mesh in one space dimension. The grid points $x_1, x_2, x_3, \ldots$ are
equispaced, but the first grid point is near the physical boundary x = 0. At regular grid points, the
symmetric stencil (1.3) is used. For the node $x_1$, the interpolatory scheme described in section 4.2
uses the indicated stencil. It requires values at the irregular points marked by darkened circles.
[Figure: stencils drawn across the three time levels t, 0, and -t.]


For nodes other than $x_1$, we can use any of the three options outlined above. For
$0 < x_1 < t$, let us define

$$\tilde{u}(x,t) = 2u(x,0) - u(x,-t) + \int_0^{x+t} \bigl(t - |y-x|\bigr)\,u_{yy}(y,0)\,dy. \tag{4.4}$$

Note that $\tilde{u}$ satisfies the wave equation exactly, under the assumption that the function
$u_{xx}(x,0)$ is extended outside the domain $x > 0$ by zero. Taking into account the
Dirichlet data, it is straightforward to verify that the exact solution is

$$u(x_1,t) = \tilde{u}(x_1,t) + g(t - x_1) - \tilde{u}(0,\,t - x_1). \tag{4.5}$$

4.1.  Quadrature  schemes.  The  most  straightforward  use  of  the  quadrature 
approach  is  to  compute  uxx  at  time  t  =  0  by  a  finite  difference  method  of  kth  order 
accuracy.  We  can  then  integrate  the  formula  (4.2)  or  (4.4)  exactly  for  a  polynomial 
approximant  of  uxx  of  degree  k  —  1.  For  k  =  2  this  involves  computing  the  second 
derivative  using  the  usual  3-point  stencil  at  regular  grid  points  and  a  one-sided  4-point 
stencil for the irregular points $x = 0, x_1$. The necessary quadratures are easy to derive
for  a  piecewise  linear  approximation  of  uxx. 

4.2. Interpolation schemes. Integrating by parts yet again, we can rewrite
the formula (4.4) for $\tilde{u}(x_1,t)$ as

$$\tilde{u}(x_1,t) = -u(x_1,-t) + u(x_1+t,0) + u(0,0) - (t - x_1)\,u_x(0,0).$$

Combining this result with (4.5), we have

$$u(x_1,t) = -u(x_1,-t) + u(x_1+t,0) + g(t - x_1) + g(-t + x_1) - u(t - x_1,0). \tag{4.6}$$

For regular grid points, we use the exact formula (1.3). Once we choose a method for
approximating the values $g(t - x_1)$, $g(-t + x_1)$, and $u(t - x_1, 0)$, we have a well-defined
evolution scheme. In our numerical experiments, we assume the Dirichlet data $g(t)$ is
known analytically, so that we only need to interpolate $u(t - x_1, 0)$.
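
The boundary update can be checked against any exact solution of the one-dimensional wave equation. The sketch below takes a superposition of a right-moving and a left-moving wave, $u(x,t) = F(x - t) + H(x + t)$, defines the Dirichlet data as $g(t) = u(0,t)$, and verifies the update written as $u(x_1,t) = -u(x_1,-t) + u(x_1+t,0) + g(t-x_1) + g(-t+x_1) - u(t-x_1,0)$. The profiles `F` and `H` and the values of `x1` and `t` are illustrative assumptions.

```python
import math

F = math.sin                         # illustrative right-moving profile
H = lambda s: math.cos(2.0 * s)      # illustrative left-moving profile

def u(x, t):
    """Exact solution of u_tt = u_xx built from two traveling waves."""
    return F(x - t) + H(x + t)

def g(t):
    """Dirichlet data at the boundary x = 0, taken from the exact solution."""
    return u(0.0, t)

x1, t = 1.0e-5, 0.1                  # small first cell, 0 < x1 < t

lhs = u(x1, t)
rhs = (-u(x1, -t) + u(x1 + t, 0.0)
       + g(t - x1) + g(-t + x1) - u(t - x1, 0.0))
print(abs(lhs - rhs))
```

In a marching scheme only $u(t - x_1, 0)$ is unavailable on the mesh, which is why the accuracy of the interpolation at that single point governs the accuracy of the boundary update.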

4.3. Extrapolation schemes. As a final alternative, one can try to use the
time symmetric formula (1.3) for all grid points. This involves the value $u(x_1 - t, 0)$,
which requires extrapolation from the known data at $x = 0, x_1, x_2, \ldots$
Table 5.1

Performance of the quadrature, interpolation, extrapolation, and leapfrog schemes. The first
column lists the number of subintervals in the uniform grid region. The second through the sixth
columns list the L2 error from using the indicated evolution scheme after N steps.

  N     E2(Q2)          E2(I1)          E2(I3)           E2(X1)           E2(LF2)
  16    0.58 · 10^-4    0.31 · 10^-6    0.79 · 10^-10    0.17 · 10^54     6.35 · 10^7?
  32    0.12 · 10^-4    0.16 · 10^-6    0.97 · 10^-11    0.39 · 10^104    0.42 · 10^150
  64    0.28 · 10^-5    0.81 · 10^-7    0.11 · 10^-11    -                -
  128   0.67 · 10^-6    0.41 · 10^-7    0.14 · 10^-12    -                -
  256   0.16 · 10^-6    0.19 · 10^-7    0.27 · 10^-13    -                -


5. A numerical example. We have implemented simple versions of the various
methods described above: the second order quadrature scheme (Q2), the interpolation
scheme using linear and cubic interpolation (I1, I3), and the extrapolation scheme
using linear approximation (X1). For the sake of comparison, we use the same values
of $u_{xx}$ as in the quadrature approach, but march using the simplest leapfrog scheme

$$u(x,t) = 2u(x,0) - u(x,-t) + t^2 u_{xx}(x,0). \tag{5.1}$$

We will denote this method by LF2.

We consider the wave equation on [0, 1] as an initial/boundary-value problem
with exact solution sin(x - t) + sin(x - t - ±). We set $x_1 = 1.0 \cdot 10^{-5}$, $x_{N+1} =
1 - 1.0 \cdot 10^{-6}$, and place N - 1 equispaced points on the interval $[x_1, x_{N+1}]$. With
N = 16, 32, 64, 128, 256, both the first and last cells are extremely small in comparison
with $\Delta t = (x_{N+1} - x_1)/N$. The calculation is terminated after N steps, at which
point we measure the L2 error of the solution. The scheme used at the right boundary
(x = 1) is analogous to the one described above at the left boundary (x = 0).

Results of the methods Q2, I1, I3, X1, LF2 are summarized in Table 5.1.

Q2, I1, and I3 appear to be stable, while both the extrapolation and leapfrog
schemes diverge. It is also worth noting that Q2 is globally second order accurate,
I1 is globally first order accurate, and I3 is globally third order accurate. This is
consistent with a straightforward local error analysis. The reason that the first order
scheme I1 is more accurate than Q2 for small N is that we are using an exact formula
away from the irregular nodes in the former and a second order accurate quadrature
at all points in the latter.

6.  Conclusions.  We  have  derived  a  new  exact  representation  for  solutions  of 
the  wave  equation.  Theorem  2.1  and  theorem  3.1  may  be  of  analytical  interest  in 
their own right, but we have concentrated in this note on exploring some numerical
consequences. We believe that marching schemes based on this approach have
advantageous stability properties when compared to existing methods, most notably
in removing the "small cell" problems which arise when using unstructured grids or
regular Cartesian meshes in complex geometries. Although small cells can be easily
eliminated in one dimension, at some cost in accuracy, doing so in two or three dimensions
is more complicated and results in greater loss of accuracy. Furthermore,
higher-order discretizations require small cells near the boundary to avoid the Runge
phenomenon.

We have illustrated the advantages in the simplest one-dimensional model problem,
but the extension to higher dimensions is straightforward. Suppose, for example,
that we are solving the wave equation in a domain $\Omega \subset \mathbb{R}^d$. If a point $x$ is within a
time step $t$ of the domain boundary $\partial\Omega$, we define the function

$$\tilde{u}(x,t) = 2u(x,0) - u(x,-t) + \int_{S_t \cap \Omega} G_d(|x-y|,t)\,\Delta u(y,0)\,dy, \tag{6.1}$$

where $S_t = \{y : |y - x| < t\}$. Whereas in one dimension, the exact solution is given
by (4.5), it is now of the form

$$u(x,t) = \tilde{u}(x,t) + B(\partial\Omega, \tilde{u}, g). \tag{6.2}$$

The operator $B(\partial\Omega, \tilde{u}, g)$ describes the exact solution to the Dirichlet problem with
zero initial data and boundary condition $g(x,t) - \tilde{u}(x,t)$. This can be written out
explicitly in terms of hyperbolic potential theory and can easily be generalized to
Neumann or Robin boundary value problems.

It is not surprising, perhaps, that robustness and stability come at a price. In
our formulation, that price is the construction of appropriate quadratures for both
the volume integral in (6.1) and the boundary operator $B(\partial\Omega, \tilde{u}, g)$ in (6.2). Higher
dimensional examples, higher-order discretizations, and stability estimates will be
reported at a later date.

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1  Introduction 

Gaussian quadratures are a classical tool of numerical integration and possess several desirable
features such as uniqueness of nodes, positivity of weights, and an optimal number
of nodes: an N-point Gaussian rule is exact for all polynomials of orders up to 2N - 1,
and no N-point rule is exact for all polynomials of order 2N. Since many situations require
high-order quadratures in dimensions greater than one, a number of attempts have been
made to construct quadrature rules in two dimensions that resemble the Gaussian ones. Of
particular interest are quadrature rules on triangles, which are a standard tool for describing
surfaces and arise in many other situations.

One  widely  used  approach  is  the  naive  tensor  product  rule,  based  on  one  dimensional 
quadratures.  This  approach  is  effective  when  the  region  of  integration  is  a  parallelogram. 
It is fairly straightforward to construct "tensor product" quadrature rules on triangles (see,
for example, [12]) and on certain other polygons. However, the resulting quadrature rules
are  less  efficient  than  those  on  rectangles.  Furthermore,  tensor  product  rules  lack  symmetry 
on  triangles,  a  convenient  feature  for  programming. 

Lyness and Jespersen performed an exhaustive study of quadrature rules on triangles,
and developed two types of fairly efficient rules which they termed "holistic" and "cytolic".
They generated rules of orders up to twelve [6]. Berntsen and Espelid constructed quadrature
rules of degree 13 for the triangle [1].

We present a scheme for the generation of reasonably high-order quadratures for polynomials
on triangles in $R^2$. The scheme is based on the simple observation that integrals
over regularly shaped regions are invariant under certain transformations. It is essentially a
formalization and generalization of the approach used in [6]. With this scheme, quadrature
rules of orders up to 30 on triangles have been obtained.

The  structure  of  this  paper  is  as  follows.  In  Section  2  we  introduce  mathematical  and 
numerical  preliminaries.  In  Section  3  we  develop  the  analytical  apparatus  used  in  the 
construction  of  the  quadrature  rules.  We  describe  our  scheme  in  Section  4,  and  illustrate  it 
with  the  construction  of  quadratures  on  a  standard  triangle  in  Section  5.  Finally  Section  6 
contains  discussions  and  conclusions. 

2  Mathematical  and  Numerical  Preliminaries 

In  this  section,  we  collect  the  relevant  mathematical  and  numerical  tools  to  be  used  in 
Section  3. 

2.1  Representation  Theory 

Following is a summary of several elementary facts about representations of finite groups;
a more detailed discussion on this subject can be found, for example, in [13].

Suppose that Q is a region in the xy-plane and G the symmetry group of Q. As is
well-known from the theory of representations, the points of Q may be partitioned into
G-orbits, each of which is an equivalence class on Q with respect to the relation defined by
the group G.
Similarly, the function spaces on Q may be partitioned into subspaces, each containing
functions that transform according to a particular irreducible representation (IR) of G.
Furthermore, the inner product, defined as

$$(f_i, f_j) = \iint_Q f_i(x,y)\,f_j(x,y)\,dx\,dy, \tag{1}$$

vanishes if $f_i$ and $f_j$ transform according to distinct IRs.

Two immediate consequences follow from the preceding fact:

• Since

$$\iint_Q f(x,y)\,dx\,dy = (1, f), \tag{2}$$

any function not belonging to the identity representation of G integrates to zero.

• If the set $\{(x_i, y_i)\}$, $i = 1, 2, \ldots, n$, constitutes a G-orbit in Q, then the function

$$\sum_{i=1}^{n} \delta(x - x_i,\, y - y_i)$$

belongs to the identity representation; thus

$$\sum_{i=1}^{n} f(x_i, y_i) = 0$$

for any $f$ belonging to any IR other than the identity.

In  other  words,  when  constructing  a  quadrature  rule  that  is  invariant  under  G,  we  need  only 
adjust  the  weights  and  abscissae  to  correctly  integrate  functions  belonging  to  the  identity 
representation;  all  functions  belonging  to  nontrivial  representations  are  integrated  exactly. 

Conveniently, the operator that projects onto the function subspace transforming according
to the identity representation is given by a sum over transformed functions:

$$(P_E f)(x,y) = \frac{1}{m}\sum_{i=1}^{m} \Phi_{g_i}(f)(x,y), \tag{3}$$

where $\Phi_{g_i}(f)$ denotes the transformation of $f(x,y)$ according to $g_i \in G$, and $m$ is the order
of G.

We  will  denote  the  standard  equilateral  triangle  in  R2  by  T  (see  Figure  2.1).  In  this 
case,  the  symmetry  group  is  usually  denoted  by  D3,  and  is  of  order  6.  The  points  of  T  are 
naturally classified by D3-orbits, each orbit containing one, three, or six points. The first
class  consists  of  the  center  of  the  triangle;  the  second  class  consists  of  the  union  of  three 
medians  minus  the  center  of  the  triangle;  the  third  class  consists  of  all  points  on  T  not 
belonging  to  the  first  and  second  classes. 
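
The three orbit classes are easy to enumerate in barycentric coordinates. The sketch below relies on the assumption that the six symmetries of the equilateral triangle act on a barycentric point (a, b, c) by permuting its coordinates, so the orbit is the set of distinct permutations. The function name `d3_orbit` and the sample points are illustrative.

```python
from fractions import Fraction
from itertools import permutations

def d3_orbit(a, b, c):
    """Distinct images of the barycentric point (a, b, c) under D3."""
    assert a + b + c == 1, "barycentric coordinates must sum to one"
    return sorted(set(permutations((a, b, c))))

# Center of the triangle: all coordinates equal, orbit of one point.
center = d3_orbit(Fraction(1, 3), Fraction(1, 3), Fraction(1, 3))
# Point on a median (two equal coordinates): orbit of three points.
on_median = d3_orbit(Fraction(1, 5), Fraction(1, 5), Fraction(3, 5))
# Generic point (all coordinates distinct): orbit of six points.
generic = d3_orbit(Fraction(1, 2), Fraction(1, 3), Fraction(1, 6))

print(len(center), len(on_median), len(generic))
```

Exact rational arithmetic is used so that coincident points are detected without a floating-point tolerance.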
Figure 2.1. An Equilateral Triangle

2.2  Simulated  Annealing 

Simulated annealing is a numerical technique for solving combinatorial optimization problems,
originally developed by Kirkpatrick et al. [4, 5]. The algorithm draws an analogy
between the behavior of a physical system with many degrees of freedom in thermal equilibrium
at a series of finite temperatures, as encountered in statistical physics, and the problem
of finding the minimum of a given function of many parameters, as in combinatorial
optimization. It was based on a simple idea [3]:

When  optimizing  a  very  large  and  complex  system  (i.e.,  a  system  with  many 
degrees  of  freedom) ,  instead  of  “always”  going  downhill,  try  to  go  downhill  “most 
of  the  time.” 

There are four main components in any application of the simulated annealing method:


• Configure the optimization problem into a many-body physical system with states S.

• Construct a scalar objective function E(S), which corresponds to the energy function
of a physical system; the optimization problem then becomes finding the minimum
energy configuration of the physical system.

• Construct a system of random state modifications (or updates) that obey detailed
balance [8]; the random state modifications must ensure that any allowable state of
the system is reachable.

• Develop the annealing (or cooling) schedule which governs the convergence of the algorithm.
The annealing schedule includes the initial temperature setting, the decrement
of the temperature, and the final value of the temperature; at any given temperature,
the annealing process proceeds according to the Metropolis algorithm.

The Metropolis algorithm, based on Monte Carlo techniques and developed in 1953, was
originally  designed  to  compute  the  properties  of  systems  in  thermal  equilibrium.  Following 
is  a  summary  of  the  algorithm;  details  can  be  found,  for  example,  in  the  original  paper  [8] 
by  Metropolis  et  al. 

Initially, the system is in state $S_0$ with an energy $E(S_0)$. In each step of the algorithm,
a state $S_i$ of the system is altered to $S_i'$ according to the random update scheme, and a
resulting change $\Delta E$ in the energy of the system is computed:

$$\Delta E = E(S_i') - E(S_i). \tag{4}$$

If $\Delta E < 0$, the update is accepted, and the system evolves to the new state $S_i'$; if
$\Delta E > 0$, the update is accepted with a probability $P(\Delta E)$, where

$$P(\Delta E) = \exp(-\Delta E / T), \tag{5}$$

and T is the absolute temperature.

The  choice  of  P(AE)  ensures  that  at  a  temperature  T  approaching  zero,  only  states 
with  minimum  energy  have  a  nonzero  probability  of  occurrence.  When  the  temperature  is 
lowered  in  a  sufficiently  slow  manner,  the  system  can  achieve  thermal  equilibrium  at  each 
temperature,  and  therefore  achieve  a  minimum  energy  state  at  the  low  final  temperature. 
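
The acceptance rule and the cooling schedule can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the implementation used for the quadrature search: the geometric cooling factor, the number of updates per temperature, the symmetric uniform proposal, and the one-dimensional test energy are all illustrative choices, and the best state seen so far is recorded for convenience.

```python
import math
import random

def anneal(energy, propose, s0, t_initial=1.0, t_final=1e-3,
           cooling=0.95, sweeps=200, seed=0):
    """Simulated annealing with Metropolis acceptance and geometric cooling."""
    rng = random.Random(seed)
    s, e = s0, energy(s0)
    best_s, best_e = s, e
    temperature = t_initial
    while temperature > t_final:
        for _ in range(sweeps):
            s_new = propose(s, rng)
            e_new = energy(s_new)
            delta = e_new - e
            # Accept downhill moves always, uphill moves with
            # probability exp(-delta / T), as in (5).
            if delta <= 0 or rng.random() < math.exp(-delta / temperature):
                s, e = s_new, e_new
                if e < best_e:
                    best_s, best_e = s, e
        temperature *= cooling        # lower the temperature
    return best_s, best_e

# Illustrative double-well energy with two local minima.
energy = lambda s: (s * s - 1.0) ** 2 + 0.3 * s
# Symmetric random update, so the proposal itself does not bias the walk.
propose = lambda s, rng: s + rng.uniform(-0.5, 0.5)

s0 = 3.0
best_s, best_e = anneal(energy, propose, s0)
print(best_s, best_e)
```

Because uphill moves are sometimes accepted, the walk can leave the shallower well at high temperature, which a purely downhill search started in that well could not do.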


2.3  Newton’s  Method 

Newton's method is an iterative method for solving equation systems of the form

$$F(x) = \begin{pmatrix} f_1(x_1, x_2, \ldots, x_n) \\ \vdots \\ f_n(x_1, x_2, \ldots, x_n) \end{pmatrix} = 0. \tag{6}$$


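Definition 2.1 The Jacobian DF of function F in equation (6) is defined by:

$$DF(x) = \begin{bmatrix}
\dfrac{\partial f_1}{\partial x_1} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\
\vdots & & \vdots \\
\dfrac{\partial f_n}{\partial x_1} & \cdots & \dfrac{\partial f_n}{\partial x_n}
\end{bmatrix}. \tag{7}$$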

Theorem 2.1 (Newton's Method) Let $F : R^n \to R^n$ be continuously differentiable in the
neighborhood of $\xi$ where

$$F(\xi) = 0. \tag{8}$$

Suppose that the Jacobian DF(x) is nonsingular at the point $\xi$. Given a starting point $x_0 \in R^n$,
define the sequence $x_1, x_2, \ldots$ of $R^n$ as follows:

$$x_{k+1} = x_k - DF(x_k)^{-1}F(x_k). \tag{9}$$

Then there exists $\epsilon > 0$ such that for all $x_0$ with $\|x_0 - \xi\| < \epsilon$, there exists $\delta > 0$ such that

$$\|x_{k+1} - \xi\| \le \delta\,\|x_k - \xi\|^2. \tag{10}$$

In other words, sequence (9) converges to $\xi$ quadratically if the initial point $x_0$ is sufficiently
close to $\xi$.
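
The iteration (9) can be sketched without forming the inverse Jacobian, by solving the linear system $DF(x_k)\,s = F(x_k)$ at each step. The example below is a minimal standard-library sketch; the helper `solve_linear`, the stopping tolerance, and the two-equation test system are illustrative assumptions rather than part of the method's statement.

```python
import math

def solve_linear(A, b):
    """Solve A s = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    s = [0.0] * n
    for i in reversed(range(n)):                        # back substitution
        tail = sum(M[i][j] * s[j] for j in range(i + 1, n))
        s[i] = (M[i][n] - tail) / M[i][i]
    return s

def newton(F, DF, x0, tol=1e-12, max_iter=50):
    """Iterate x_{k+1} = x_k - DF(x_k)^{-1} F(x_k) until |F| is below tol."""
    x = list(x0)
    for _ in range(max_iter):
        fx = F(x)
        if max(abs(v) for v in fx) < tol:
            break
        step = solve_linear(DF(x), fx)
        x = [xi - si for xi, si in zip(x, step)]
    return x

# Illustrative system: circle of radius 2 intersected with the line x = y.
F = lambda x: [x[0] ** 2 + x[1] ** 2 - 4.0, x[0] - x[1]]
DF = lambda x: [[2.0 * x[0], 2.0 * x[1]], [1.0, -1.0]]

root = newton(F, DF, [1.0, 2.0])
print(root)
```

From the starting point (1, 2) the first step lands on the line x = y, after which the iteration reduces to the scalar Newton iteration for the square root of 2.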

3  Analytical  Apparatus 

In  this  section,  we  develop  analytical  tools  used  in  Section  4  in  the  numerical  construction 
of the quadrature rules. For simplicity, we assume that the integration region Q belongs to
$R^2$; generalization to higher dimensions should be straightforward.


3.1  Notations 

Definition 3.1 A monomial of order n in $R^2$ is any term of x and y of the form

$$x^{n_1} y^{n_2}, \tag{11}$$

where $n_1, n_2$ are integers, $0 \le n_1, n_2 \le n$ and $n_1 + n_2 = n$. We denote the set of all
monomials of orders less than n by M(n).

Definition  3.2  The  order  of  a  quadrature  rule  is  the  lowest  order  of  monomials  for  which 
the  rule  is  inexact.  We  denote  it  by  O. 

Definition  3.3  The  efficiency  E  of  a  quadrature  rule  is  the  ratio  of  the  number  of  inde¬ 
pendent  monomials  (up  to  a  certain  order)  for  which  the  quadrature  rule  is  exact,  to  the 
number  of  free  parameters  of  the  quadrature  rule. 

Example Suppose that O is the order of a quadrature rule on an integration region Q,
and N is the number of quadrature nodes. Then the number of independent monomials
in M(O) is given by $O(O+1)/2$, and the number of ostensibly free parameters for an N-point
quadrature is 3N. Therefore, the efficiency E of the quadrature is $E = O(O+1)/(6N)$. For some
regions of integration such as the surface of the sphere, the number of natural variables
of the polynomials (x, y, z for the spherical shell) is greater than the dimensionality of
the region of integration; therefore these relations may be different. In particular, for the
spherical shell, $|M(O)| = O^2$ and $E = O^2/(3N)$.
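
The planar count in this example can be reproduced by direct enumeration, which avoids relying on a closed-form expression. The sketch below counts the monomials $x^a y^b$ with $a + b$ less than the order and divides by the 3N free parameters (two coordinates and one weight per node). The function names are illustrative.

```python
from fractions import Fraction

def count_monomials(order):
    """Number of monomials x**a * y**b with a + b < order, i.e. |M(order)|."""
    return sum(1 for a in range(order) for b in range(order - a))

def efficiency(order, n_nodes):
    """Efficiency of an n_nodes-point planar rule of the given order."""
    return Fraction(count_monomials(order), 3 * n_nodes)

# A one-point rule of order 2 uses all three of its free parameters.
print(efficiency(2, 1))
# A three-point rule of order 3 integrates six monomials with nine parameters.
print(efficiency(3, 3))
```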

Definition 3.4 A quadrature on integration region Q is said to be group invariant if it is
invariant under the transformation of every group element g in Q's symmetry group G.

3.2  Reduction  of  Dimensionality 

Given  an  integration  region  Q,  we  seek  quadrature  rules  of  order  O  with  minimum  number 
of  nodes  N  that  possess  the  following  properties: 

1.  The  quadrature  rule  is  group  invariant; 
2. All weights are positive: quadrature rules with negative weights are unstable with
noisy integrands;

3. All quadrature nodes are within the integration region Q (including the boundary).

We evaluate the resulting quadratures according to the efficiency E defined in the preceding
section.

Based on the results of Section 2.1, quadrature rules of the form

$$\sum_{i} w_i\,\frac{1}{m}\sum_{j=1}^{m} \Phi_{g_j}(f)(x_i, y_i)$$

automatically integrate correctly all functions not belonging to the identity representation.
Thus if we adjust $\{w_i\}$ and $\{(x_i, y_i)\}$ so that the quadrature integrates correctly all polynomials
belonging to the identity representation up to a certain degree, the rule will be correct
for all polynomials up to that degree. This reduces considerably the number of nonlinear
equations one must solve to obtain a quadrature rule.
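
A small, classical instance illustrates the idea: the three edge midpoints of a triangle form a single three-point orbit, and giving them equal weights yields a group invariant rule. The sketch below determines the order of that rule, in the sense of Definition 3.2, by exact rational arithmetic. As an assumption for convenience it works on the right triangle with vertices (0,0), (1,0), (0,1) rather than the equilateral triangle T; the two are related by an affine map, which preserves polynomial exactness. The closed form used for the exact monomial integrals, $a!\,b!/(a+b+2)!$, is specific to that reference triangle.

```python
from fractions import Fraction
from math import factorial

def exact_monomial_integral(a, b):
    """Integral of x**a * y**b over the triangle (0,0), (1,0), (0,1)."""
    return Fraction(factorial(a) * factorial(b), factorial(a + b + 2))

def rule_order(nodes, weights, max_order=10):
    """Lowest monomial order at which the rule is inexact (None if not found)."""
    for n in range(max_order + 1):
        for a in range(n + 1):
            b = n - a
            approx = sum(w * x ** a * y ** b
                         for w, (x, y) in zip(weights, nodes))
            if approx != exact_monomial_integral(a, b):
                return n
    return None

half = Fraction(1, 2)
nodes = [(half, Fraction(0)), (Fraction(0), half), (half, half)]  # edge midpoints
weights = [Fraction(1, 6)] * 3        # equal weights summing to the area 1/2

order = rule_order(nodes, weights)
print(order)
```

The rule is exact for all six monomials of order less than three, so with N = 3 nodes its efficiency in the sense of Definition 3.3 is 6/9.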


4  Construction  of  Quadratures 


We  now  construct  group  invariant  quadrature  rules  with  the  mathematical  and  numerical 
apparatus  developed  in  Sections  2  and  3. 

Remark Given an integration region Q, the symmetry group G is either finite or infinite.
If G is finite, we seek quadrature rules that are invariant to the entire group; otherwise,
we  select  some  maximal  subgroup  for  which  a  group  invariant  quadrature  rule  exists,  and 
construct  quadratures  accordingly.  In  some  cases,  the  size  of  the  symmetry  group  may  grow 
with  the  number  of  nodes  in  the  quadrature;  an  excellent  example  of  this  is  the  circle,  where 
the  order  of  the  maximal  subgroup  equals  the  number  of  nodes  N. 

Due to Section 2.1, group invariant quadrature nodes may be partitioned into G-orbits
where G is the symmetry group. We parameterize the i-th G-orbit by $x(\{\lambda_i\})$, where
$\{\lambda_i\} = \lambda_{i1}, \ldots, \lambda_{i\nu}$, and $\nu$ is determined by the degrees of freedom of the orbit (e.g., $0 \le \nu \le 2$
if the integration region Q is in $R^2$). We denote the number of points contained in the i-th
orbit by $m_i$, and the corresponding weight by $w_i$.

We compute the quadrature nodes $x(\{\lambda_i\})$ and weights $w_i$ by solving the following
non-linear system:

$$\begin{aligned}
\sum_{i=1}^{\Lambda} w_i m_i f_1[x(\{\lambda_i\})] - I_1 &= 0, \\
\sum_{i=1}^{\Lambda} w_i m_i f_2[x(\{\lambda_i\})] - I_2 &= 0, \\
&\;\;\vdots \\
\sum_{i=1}^{\Lambda} w_i m_i f_n[x(\{\lambda_i\})] - I_n &= 0,
\end{aligned} \tag{12}$$

where $\Lambda$ is the number of distinct orbits occupied by the quadrature nodes,

$$I_j = \int_Q f_j(x)\,dx, \tag{13}$$

and $f_1, f_2, \ldots, f_n$ are the set of polynomials to be evaluated.

By Section 2.1, a group invariant quadrature will automatically be correct for any
polynomial that is orthogonal to the subspace of group invariant polynomials. Therefore we
only need to evaluate polynomials that transform according to the identity representation
of G; an appropriate choice of the set of polynomials to be evaluated would be a group
invariant orthogonal basis (up to a certain order) on the region Ω, which may be obtained via
equation (3).

We use Newton's method to solve the nonlinear system (12), with the iterative sequence
defined by equation (9); this process converges quadratically by Theorem 2.1. The partial
derivatives of the individual functions in (12) with respect to the weights and the
parameters of the nodes, which are needed in Newton's method, are

$$ \frac{\partial f_j}{\partial w_i} = m_i f_j[x(\{\lambda_i\})], \qquad i = 1, 2, \ldots, A, \tag{14} $$

$$ \frac{\partial f_j}{\partial \lambda_{ik}} = w_i m_i \frac{\partial f_j[x(\{\lambda_i\})]}{\partial \lambda_{ik}}, \qquad i = 1, 2, \ldots, A, \quad k = 1, \ldots, n_i, \tag{15} $$

where $n_i$ is the number of parameters of the $i$-th orbit; in the case of $T$, $n_i = 0$, 1, or 2,
respectively, for orbits containing 1, 3, or 6 points.
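
As an illustration of the Newton iteration built from (12), (14) and (15), the following minimal sketch solves for a single 3-point orbit on the triangle. The choice of test polynomials ($f_1 = 1$ and $f_2 = u_1^2 + u_2^2 + u_3^2$ in the barycentric variables of Section 5.1), the normalization of the integrals to a triangle of unit area, the starting values, and the function name are assumptions made for this example only; they are not taken from the implementation described in the text.

```python
def newton_three_point_rule(w=0.3, lam=0.6, tol=1e-14, max_iter=50):
    """Newton iteration for one 3-point orbit (m = 3) with unknowns (w, lam).

    The orbit consists of the permutations of (lam, (1-lam)/2, (1-lam)/2).
    Integrals are normalized so that the integral of 1 over the triangle is 1.
    """
    m = 3
    I1, I2 = 1.0, 0.5  # normalized integrals of f1 = 1 and f2 = u1^2 + u2^2 + u3^2
    for _ in range(max_iter):
        g = lam**2 + 0.5 * (1.0 - lam) ** 2   # f2 evaluated on the orbit
        dg = 3.0 * lam - 1.0                  # derivative of f2 with respect to lam
        F1 = w * m * 1.0 - I1                 # first equation of the system
        F2 = w * m * g - I2                   # second equation of the system
        # Jacobian: derivatives w.r.t. w follow (14), w.r.t. lam follow (15)
        J11, J12 = m * 1.0, 0.0
        J21, J22 = m * g, w * m * dg
        det = J11 * J22 - J12 * J21
        # Solve J * (dw, dlam) = -(F1, F2) by Cramer's rule
        dw = (-F1 * J22 + F2 * J12) / det
        dlam = (-F2 * J11 + F1 * J21) / det
        w += dw
        lam += dlam
        if abs(dw) + abs(dlam) < tol:
            break
    return w, lam


if __name__ == "__main__":
    print(newton_three_point_rule())
```

From this starting point the iteration settles on $w = 1/3$, $\lambda = 2/3$; the same system also has the root $\lambda = 0$ (edge midpoints), which shows why the initial approximation matters.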

As is well known, Newton's method is extremely sensitive to the choice of the initial
approximation $x_0$ (see Section 2.3). In practice, the nonlinear system (12) is often
underconstrained: the number of equations that can be solved with weights that are positive
and nodes that are in the region of integration is smaller than the number of unknowns
$A + \sum_i n_i$. Simulated annealing provides a tool under such circumstances for finding the
initial approximation $x_0$; we define the objective function $J$ via the formula:


Ti  Ai)}) 


(16) 


Our implementation of the method closely follows the standard procedure set forth
in [4], with a randomly selected starting configuration $S_0$ and randomly chosen small
displacements of nodes and weights at each step. The annealing temperature is decreased
according to

$$ T_k = \alpha T_{k-1}, \tag{17} $$

where $\alpha$ is a constant smaller than but close to 1. Sometimes the cooling process fails, and
we need to adjust the temperature manually. Throughout the process, any weight that is
negative after the random displacement is set to zero.
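
The annealing loop described above can be sketched as follows. This is only an illustration under stated assumptions: the objective used here is a sum of squared residuals of a small two-equation system (formula (16) is not legible in this copy, so this is a stand-in, not the authors' objective), and the function name, step size, acceptance rule (standard Metropolis) and all default parameters are hypothetical.

```python
import math
import random


def anneal(objective, x0, n_weights, t0=1.0, alpha=0.95,
           n_temps=100, steps_per_temp=50, step=0.05, seed=0):
    """Simulated annealing with geometric cooling T_k = alpha * T_{k-1}.

    The first n_weights entries of the unknown vector are treated as
    weights and are reset to zero whenever a random move makes them negative.
    """
    rng = random.Random(seed)
    x, fx = list(x0), objective(x0)
    best, fbest = list(x), fx
    temperature = t0
    for _ in range(n_temps):
        for _ in range(steps_per_temp):
            # small random displacement of every unknown
            y = [xi + rng.uniform(-step, step) for xi in x]
            for k in range(n_weights):
                if y[k] < 0.0:
                    y[k] = 0.0
            fy = objective(y)
            # Metropolis acceptance rule (an assumption of this sketch)
            if fy <= fx or rng.random() < math.exp(-(fy - fx) / temperature):
                x, fx = y, fy
                if fx < fbest:
                    best, fbest = list(x), fx
        temperature *= alpha  # cooling step, as in (17)
    return best, fbest


def toy_objective(v):
    """Assumed objective: squared residuals for one 3-point orbit (w, lam)."""
    w, lam = v
    g = lam**2 + 0.5 * (1.0 - lam) ** 2
    return (3.0 * w - 1.0) ** 2 + (3.0 * w * g - 0.5) ** 2


if __name__ == "__main__":
    print(anneal(toy_objective, [0.9, 0.1], n_weights=1))
```

The returned configuration is intended only as a starting point for Newton's method, not as a converged rule.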


5  Quadratures  on  the  Triangle 

We  have  implemented  the  numerical  scheme  described  in  Section  4  on  the  triangle  T  (see 
Figure  2.1)  and  obtained  rules  of  orders  up  to  30.  Any  other  triangle  may  be  mapped  onto 
T  via  an  affine  transformation. 

5.1  Parameterization of Quadrature Nodes

We parameterize a point on the triangle with three dependent variables $u_1$, $u_2$, and $u_3$, with
the constraint

$$ u_1 + u_2 + u_3 = 1. \tag{18} $$

The variables $u_1, u_2, u_3$ are related to the Cartesian variables $x, y$ via the following
formulae:

$$ x = \frac{2u_1 - u_2 - u_3}{2}, \tag{19} $$

$$ y = \frac{\sqrt{3}\,(u_2 - u_3)}{2}, \tag{20} $$

$$ u_1 = \frac{1 + 2x}{3}, \tag{21} $$

$$ u_2 = \frac{1 - x + \sqrt{3}\,y}{3}, \tag{22} $$

$$ u_3 = \frac{1 - x - \sqrt{3}\,y}{3}. \tag{23} $$
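
The change of variables (19) - (23) is easy to check numerically. The short sketch below implements both directions and verifies that they invert each other; the function names are chosen for this example only.

```python
import math


def bary_to_cart(u1, u2, u3):
    """Map (u1, u2, u3) with u1 + u2 + u3 = 1 to Cartesian (x, y), formulae (19)-(20)."""
    x = (2.0 * u1 - u2 - u3) / 2.0
    y = math.sqrt(3.0) * (u2 - u3) / 2.0
    return x, y


def cart_to_bary(x, y):
    """Inverse map, formulae (21)-(23)."""
    u1 = (1.0 + 2.0 * x) / 3.0
    u2 = (1.0 - x + math.sqrt(3.0) * y) / 3.0
    u3 = (1.0 - x - math.sqrt(3.0) * y) / 3.0
    return u1, u2, u3


if __name__ == "__main__":
    # The three vertices of the triangle correspond to (1,0,0), (0,1,0), (0,0,1).
    for u in [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]:
        print(u, "->", bary_to_cart(*u))
```

Under this map the triangle $T$ is equilateral, with vertices $(1, 0)$ and $(-1/2, \pm\sqrt{3}/2)$.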

5.2  Group  Invariant  Orthogonal  Polynomials 


Table  1:  Group  Invariant  Orthonormal  Polynomials  on  the  Triangle 

One  set  of  orthogonal  polynomials  on  T  is  given  by  the  direct  product  of  Jacobi  poly¬ 
nomials  and  Legendre  polynomials: 
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(1  (24) 

where 

Rnm(z)  =  Pl™*lfi(z).  (25) 

Using  the  projection  operator  specified  in  Section  2.1,  we  obtain  the  group  invariant 
orthogonal  basis  on  T ;  the  normalized  basis  polynomials  of  orders  less  than  8  are  shown  in 
Table  1. 

5.3  Numerical  Results 


Figure  5.3.  Triangle  T  and  the  quadrature  nodes  for  O  =  30. 


In agreement with the criteria specified in Section 3.2, all quadratures we obtained
have nodes and weights that are invariant under the action of $D_3$. In Table 2, we list
the quadrature rules of orders $O = 5$, 10, 20, and 30; the quadrature nodes are partitioned
into three types of orbits as described in Section 2.1, and each orbit is listed as a set of
parameters $\{\lambda_i\}$, $i = 0, 1, 2$; all weights are normalized so that the integral of any basis
polynomial on $T$ is one.


O

Weights ($w_i$)

Nodes

$\lambda_1$  $\lambda_2$

5

2.250000000000000E-01

1.323941527885062E-01

5.971587178976982E-02

1.259391805448271E-01

7.974269853530873E-01

10 

8.3523399805 19637E  -  02 

7.229850592056742E  -  03 

4.269134091050350E -  03 

7.4492 17792098051E-02 

1.439751005418876E-01 

7.864647340310853E  -  02 

6. 304871 745 135509E  -  01 

6.9283230871 07503E-  03 

9.590375628566449E-01 

6.928323087107503E  -  03 

3.50029898972720 IE  -  02  1.365735762560334E  -  01 

2.951832033477940E  -  02 

3.50029898972720LE7  -  02 

2.951832033477940E -  02 

3.754907025844263E  -  02  3.327436005886387E  -  01 

3.957936719606124E-  02 

3.754907025844263E  -  02 

20 

2.761042699769952E-02 

1.779029547326740E  -  03 

1.500649324429017E-03 

2.011239811396117^7  —  02 

9.413975193895086E  -  02 

2.681 784725933157E- 02 

2.044721240895264E  -  01 

2.452313380150201E  -  02 

4.709995949344253E  -  01 

1.639457841069538E  -  02 

5.779620718158465E  -  01 

1.479590739864960E-02 

7.845287856574573E  -  01 

4.579282277704251E  -  03 

9.218618243243946E  -  01 

1.651826515576217E-03 

9.776512405413408E-01 

1.651826515576217E  -  03 

5.349618187337239E -  03  6.354966590835223E  -  02 

2.349 1 70908575584E  -  03 

5.349618187337239E  -  03 

2.349170908575584E-03 

7.954817066198923E  -  03  1.571069189407069E  -  01 

4.465925754181793E  —  03 

7.954817066198923E  -  03 

4.465925754181793E-03 

1.042239828126384E  -  02  3.956421143643740E-01 

6.099566807907971E  -  03 

1.042239828126384E  -  02 

6.099566807907971E  -  03 

1.096441479612335E  -  02  2.731675707129105E  -  01 

6.891081327188203E  -  03 

1.096441479612335E  -  02 

6.891081327188203E  -  03 

3.856671208546238E  -  02  1.017853824850170E-01 

7.997475072478161E  -  03 

3.856671208546238E  -  02 

7.997475072478161E-03 

3.558050781721823E  -  02  4.466585491764138E-  01 

7.386134285336023E-03 

3.558050781721823E  -  02 

7.386134285336023E-  03 

4.967081636276412E  -  02  1.990107941495031E  -  01 

1 .279933187864826E  -  02 

4.967081636276412E  -  02 

1.279933187864826JE7  -  02 

5.851972508433171E  -  02  3.24261 1836922827E  -  01 

1.725807117569655E-  02 

5.851972508433171E  -  02 

1.725807117569655E-02 

1.214977870043943E  -  01  2.085313632101329E  -  01 

1 .867294590293547E  -  02 

1.214977870043943E  -  01 

1 .867294590293547E  -  02 

1 .4071 08449439387E  -  01  3.231705665362575E  -  01 

2.281822405839526E  -  02 

1.407108449439387E  -  01 

30 

1.557996020289920E  -  02 

3. 177233700534134E  -  03 

7.3301 16432765550E- 03 

1.048342663573077E  -  02 

8.299567580296455E  -  02 

1.320945957774363E-02 

1.509809561254103E  -  01 

1 .4975006966271 50E  -  02 

2.359058598921665E  -  01 

1.498790444338419E-02 

4.38024308407848LE  -  01 

1.3338864 74102166E  — 02 

5.4530204829193121?  -  01 

1.0889171 11390201E  — 02 

6.5088177698254031?  -  01 

8.189440660893461E  —  03 

7.5348314559712681?  -  01 

5.575387588607785E  -  03 

8.3983154221560631?  -  01 

3.191216473411976E  — 03 

9.044510651842024E  -  01 

1.296715144327045E-03 

9.5655897063971701?  -  01 

2.982628261349172E  -  04 

9.9047064476912611?  -  01 

2.982628261349172E  -  04 

9.25371 1933464866E  -  04  4.152952709133117E  -  01 

9.989056850788964E  -  04 

9.2537119334648661?  -  04 

9.989056850788964E  -  04 

1.385925855563978E  -  03  6.118990978534904E  -  02 

4.6285084917325331?  -  04 

1.3859258555639781?  -  03 

4.628508491732533E  —  04 

3.682415455910755E  -  03  1. 64908690 1369066E  -  01 

1.2344513363824131?  -  03 

3.6824154559107551?  -  03 

1.234451336382413E-03 

3.9032234241593661?  -  03  2.50350622320025  IE  -  02 

5.707198522432062E  -  04 

3.903223424159366E  -  03 

5.707198522432062E-04 

3.233248155010538E  -  03  3.060644651510958E  -  01 

1.126946125877624E-03 

3.2332481550105381?  -  03 

1.126946125877624E  -  03 

6.4674321 12236475E  -  03  1.070732837302181E  -  01 

1.747866949407337E  -  03 

6.4674321122364751?  -  03 

1 . 7478669494073371?  -  03 

3.247475491332623E  -  03  2.299575493455843E  -  01 

1.182818815031656E  —  03 

3.2474754913326231?  —  03 

1.182818815031656E-03 

8.6750908067537631?  -  03  3.3703663330578301?  -  01 

1 .990839294675034E  -  03 

8.675090806753763E  -  03 

1.990839294675034E  -  03 

1.559702646731387E  -  02  5.6256576 18206073E  -  02 

1 .9004127950359801?  -  03 

1.5597026467313871?  -  02 

1.9004127950359801?  —  03 

1.79767212536852LE  -  02  4.02451375212401QE  -  01 

4.4983658088174511?  —  03 

1.797672125368521 E  -  02 

4.49836580881 7451E  -  03 

1.7124245353889311?  -  02  2.436547020108285E  -  01 

3.4787194602747191?  -  03 

1.71 242453538893 IE  —  02 

3.4787194602747191?  -  03 

2.288340534658187E  -  02  1.653895856145327E  -  01 

4.1023990367239531?  -  03 

2.288340534658187E  -  02 

4.1023990367239531?  —  03 

3.273759728776665E  -  02  9.930187449584690E  -  02 

4.0217615497441621?  —  03 

3.273759728776665E  -  02 

4.021761549744162E  — 03 

3.382101234234097E  -  02  3.084783330690550E  -  01 

6.033 1 64660795066E  -  03 

3.382101234234097E  —  02 

6.0331646607950661?  -  03 

3.554761446001525E  -  02  4.606683185921 130E  -  01 

3.9462903021295981?  -  03 

3.554761446001525E  -  02 

3.9462903021295981?  — 03 

5.053979030686655E  -  02  2.188152994539297E  -  01 

6.6440445376802681?  -  03 

5.053979030686655E  -  02 

6.6440445376802681?  -  03 

5.701471491573222E  -  02  3.79209551560274LE  -  01 

8.2543058560784581?  -  03 

5.701471491573222E  -  02 

8.2543058560784581?  -  03 

6.415280642120340E  -  02  1.429608194181854E  -  01 

6.49605663340641  IE  -  03 

6.415280642120340E  -  02 

6.4960566334064111?  -  03 

8.0501 14828762564E  -  02  2.837312821059250E  -  01 

9.252778144146602E  -  03 

8.0501 14828762564E  -  02 

9.2527781441466021?  -  03 

1.043670681345305E  -  01  1.967374410044408E  -  01 

9.164920726294278E  —  03 

1.043670681345305E  -  01 

9.164920726294278 E  -  03 

1.138448944287513E  -  01  3.558891412116621E  -  01 

1.156952462809767E-02
1.156952462809767E-02
1.176111646760917E-02
1.176111646760917E-02
1.382470218216540E-02

1.138448944287513E-01

1.453634877155238E-01  2.598186853519115E-01

1.453634877155238E-01

1.899456528219788E-01  3.219231812312984E-01

1.899456528219788E-01

Table 2: Quadratures on Triangle of Orders O = 5, 10, 20 and 30.

Conversion from the parameters $\{\lambda_i\}$ to $\{u_1, u_2, u_3\}$ is defined by the following rules:

•  No $\lambda$ parameter: $u_1 = u_2 = u_3 = 1/3$;

•  One parameter $\lambda_1$: $u_1 = \lambda_1$, $u_2 = u_3 = (1 - \lambda_1)/2$;

•  Two parameters $\lambda_1, \lambda_2$: $u_1 = \lambda_1$, $u_2 = \lambda_2$, $u_3 = 1 - \lambda_1 - \lambda_2$.

The Cartesian coordinates $x, y$ of each quadrature node in the orbit specified by $\{\lambda_i\}$ may
be obtained from any permutation of $\{u_1, u_2, u_3\}$ using formulae (19) and (20).
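
The conversion rules above can be sketched in a few lines. The example expands the three orbits of the order-5 rule of Table 2 into its 7 nodes and checks the rule against the exact integrals of barycentric monomials, $\int_T u_1^a u_2^b u_3^c \, dA / |T| = 2\, a!\, b!\, c! / (a+b+c+2)!$. The helper names, the normalization by the area $|T|$, and the monomial test set are assumptions of this sketch rather than the authors' test code.

```python
import math
from itertools import permutations

# Order-5 rule from Table 2: (weight, orbit parameters)
RULE_ORDER_5 = [
    (2.250000000000000e-01, ()),
    (1.323941527885062e-01, (5.971587178976982e-02,)),
    (1.259391805448271e-01, (7.974269853530873e-01,)),
]


def orbit_points(params):
    """Expand an orbit given by 0, 1 or 2 parameters into barycentric points."""
    if len(params) == 0:
        u = (1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)
    elif len(params) == 1:
        lam = params[0]
        half = (1.0 - lam) / 2.0
        u = (lam, half, half)
    else:
        lam1, lam2 = params
        u = (lam1, lam2, 1.0 - lam1 - lam2)
    # distinct permutations: 1, 3 or 6 points
    return sorted(set(permutations(u)))


def integrate(rule, f):
    """Apply the rule to f(u1, u2, u3); weights are normalized to sum to one."""
    return sum(w * f(*u) for w, params in rule for u in orbit_points(params))


def exact_monomial(a, b, c):
    """Integral of u1^a u2^b u3^c over the triangle, divided by its area."""
    return (2.0 * math.factorial(a) * math.factorial(b) * math.factorial(c)
            / math.factorial(a + b + c + 2))


if __name__ == "__main__":
    n_nodes = sum(len(orbit_points(p)) for _, p in RULE_ORDER_5)
    print("nodes:", n_nodes)
    print(integrate(RULE_ORDER_5, lambda u1, u2, u3: u1**2), exact_monomial(2, 0, 0))
```

Cartesian node positions, if needed, follow by applying (19) and (20) to each barycentric triple.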


5.4  Accuracy 


Order (O)   Nodes (N)   Error            Order (O)   Nodes (N)   Error
    1           1       0                   16          54       7.285839E-16
    2           1       0                   17          58       8.721051E-16
    3           3       1.665335E-16        18          66       5.308254E-16
    4           6       2.081668E-16        19          73       8.665267E-16
    5           7       2.081668E-16        20          82       1.081492E-15
    6          12       2.775558E-16        21          85       7.406211E-16
    7          12       2.914335E-16        22          93       7.406211E-16
    8          15       6.245005E-16        23         100       1.256012E-15
    9          16       2.081668E-16        24         106       1.013946E-15
   10          19       4.293441E-16        25         118       1.242504E-15
   11          25       4.293441E-16        26         126       7.236788E-16
   12          28       4.293441E-16        27         138       1.070311E-15
   13          36       6.314393E-16        28         145       1.362410E-15
   14          40       5.464379E-16        29         154       1.057250E-15
   15          46       6.677055E-16        30         184       1.087304E-15

Table 3: Errors of Quadrature Rules for Triangles

We  test  each  quadrature  of  order  O  on  all  monomials  in  set  M(O);  the  maximum 
absolute  error  for  each  quadrature  is  listed  in  Table  3.  These  results  were  obtained  with 
calculations  of  double  precision  accuracy. 

5.5  Efficiency


              Efficiency E (%)
Order (O)   Triangle   Tensor Product
    2        100.0        100.0
    3         66.7         50.0
    4         55.6         83.3
    5         71.4         55.6
    6         58.3         77.8
    7         77.8         58.3
    8         80.0         75.0
    9         93.8         60.0
   10         96.5         73.3
   11         88.0         61.1
   12         92.9         72.1
   13         84.0         61.9
   14         87.5         71.4
   15         87.0         62.5
   16         84.0         70.8
   17         87.9         63.0
   18         86.4         70.3
   19         86.8         63.3
   20         85.4         70.0
   21         90.6         63.6
   22         85.4         69.7
   23         92.0         63.9
   24         94.3         69.4
   25         91.8         64.1
   26         92.9         69.2
   27         91.3         64.3
   28         93.3         69.0
   29         94.2         64.4
   30         84.2         68.9

Table 4: Efficiency of Triangle Rules and Tensor Product Rules

In Table 4, we list the efficiency of each quadrature rule of orders 2 through 30, and that
of the corresponding tensor product rules. The table shows that our triangle
quadratures tend to be more efficient at higher orders. The efficiency is comparable to the
results obtained by Lyness and Jespersen for their rules, whose highest order is twelve;
however, their rules tend to be more efficient than ours. The efficiency of our quadratures
is better than that of tensor product rules. For a tensor product quadrature rule to be
of order $O$, $\lceil O/2 \rceil^2$ quadrature nodes are needed, yielding an efficiency

$$ E = \frac{O(O+1)/2}{3 \lceil O/2 \rceil^2}, $$

which asymptotically approaches 2/3.
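
The efficiency figures can be reproduced with a few lines of code. The sketch below assumes that the efficiency of an $N$-node rule of order $O$ is the number of monomials integrated exactly, $O(O+1)/2$, divided by the number of free parameters $3N$; this reading is inferred from the tensor product formula and from the entries of Tables 3 and 4, and the function names are illustrative.

```python
import math


def n_monomials(order):
    """Number of monomials counted for a rule of the given order (assumed O(O+1)/2)."""
    return order * (order + 1) // 2


def efficiency(order, n_nodes):
    """Monomials integrated exactly per free parameter (3 per node)."""
    return n_monomials(order) / (3.0 * n_nodes)


def tensor_product_efficiency(order):
    """Efficiency of a tensor product rule with ceil(O/2)^2 nodes."""
    return efficiency(order, math.ceil(order / 2) ** 2)


if __name__ == "__main__":
    # (order, number of nodes) pairs taken from Table 3
    for order, nodes in [(5, 7), (10, 19), (20, 82), (30, 184)]:
        print(order,
              round(100 * efficiency(order, nodes), 1),
              round(100 * tensor_product_efficiency(order), 1))
```

For large $O$ the tensor product value oscillates between even and odd orders while tending to $2/3$, which matches the last column of Table 4.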

6  Conclusions 

We have presented a numerical scheme combining simple group theory and brute-force
optimization to reduce the dimensionality of the nonlinear system used to derive quadrature
rules. With this scheme, we obtain quadratures with orders up to 30.

This scheme is readily extensible to other symmetric regions in $\mathbb{R}^2$, and to higher
dimensions; one simply has to replace $D_3$ with the corresponding symmetry groups.

The principal drawback of this scheme is that a significant amount of human intervention
is involved in choosing initial points and adjusting the simulated annealing constants. A
more systematic procedure would be highly desirable. Also, by requiring quadrature rules
symmetric to the largest subset of the symmetry group of the integration region, some
highly efficient quadratures may be missed by our method. Such cases were observed during
our experiments on the triangle; for example, our scheme fails to find a 5-point
rule of order 4.


1  Introduction

Integral equations of classical potential theory are a tool for the solution of the Laplace
equation; they have straightforward analogues for many other elliptic partial differential equations
(PDEs). From the point of view of a modern mathematician, they are relatively simple objects.
Indeed, a second kind integral equation (SKIE) is a sum of the identity operator and a compact
operator; for most practical purposes, such an object behaves like a finite-dimensional system of
linear algebraic equations, with the Fredholm alternative replacing the theory of determinants.
Integral equations of the first kind (FKIEs) are considerably more complicated objects than
those of the second kind. Since a first kind integral operator is compact, solving a first kind
integral equation involves the application of the inverse of a compact operator to the right-hand
side; depending on the right-hand side, the result might or might not be a function. Since
the classical boundary value problems (Dirichlet, Neumann, and Robin) are easily reduced to
SKIEs, the original creators of potential theory simply ignored the FKIEs. Later, FKIEs
of classical potential theory were also investigated, and are now fairly well understood.

In  a  nutshell,  when  the  solution  of  a  Dirichlet  problem  is  represented  by  the  potential  of  a 
single  layer,  the  result  is  an  FKIE;  when  the  solution  of  a  Dirichlet  problem  is  represented  by 
the  potential  of  a  double  layer,  the  result  is  an  SKIE.  When  the  solution  of  a  Neumann  problem 
is  represented  by  a  single  layer  potential,  the  result  is  an  SKIE;  and  when  the  solution  of  a 
Neumann problem is represented by a double layer potential, the result is not a classical integral
equation, but rather an integro-pseudodifferential one (in computational electromagnetics,
this  particular  object  is  known  as  a  hypersingular  equation).  Once  the  integral  equation  is 
constructed,  the  question  arises  whether  it  has  a  solution,  whether  that  solution  is  unique,  etc. 
Generally,  questions  of  this  type  are  easily  answered  for  the  Laplace  and  Yukawa  equations, 
and  less  so  in  other  cases. 

As a computational tool, SKIEs were popular before the advent of computers; between
1950 and 1970, they were almost completely replaced with Finite Differences and Finite
Elements. The only areas where integral equations survived as a numerical tool were those where
discretizing the whole domain of definition of a PDE is impractical or very difficult, such as
radar scattering and certain areas of aerodynamics. The reasons for this lack of favor have to
do with the fact that discretization of most integral equations of potential theory leads to dense
systems of linear algebraic equations, while Finite Elements and Finite Differences result in
sparse matrices (hence the name "Finite Elements"). During the last 15 years or so, it has been
discovered that many integral operators of potential theory can be applied to arbitrary vectors
in a fast manner (for a cost proportional to $n$ for the Laplace and Yukawa equations, and for
a cost proportional to $n \cdot \log(n)$ for the Helmholtz equation, with $n$ the number of nodes in the
discretization of the integral operator). Detailed discussion of such numerical issues is outside
the scope of this paper, and we refer the reader to [5, 6]. Here, we remark that the interest
in integral formulations of problems of mathematical physics has been increasing, and that
classical tools of potential theory have turned out to be insufficient for dealing with many problems
encountered in practice.

Specifically, many applications lead to integral formulations involving not only integral
equations, but also integro-pseudodifferential ones. More frequently, while it is possible to
formulate a problem as an FKIE or an SKIE, the numerical behavior (stability) of the resulting
schemes leaves much to be desired. In such cases, it is sometimes possible to reformulate the
problem as an integro-pseudodifferential equation with drastically improved stability
properties (perhaps after an appropriate preconditioning). A simple example of such a situation is
the exterior Neumann problem for the Helmholtz equation, where the classical SKIE has
so-called spurious resonances, coinciding with those for the interior Dirichlet problem on the same
surface, and having nothing to do with the behavior of the exterior Neumann problem being
solved. The so-called "combined field equation" solves the problem of spurious resonances at
the expense of replacing an integral equation with an integro-pseudodifferential one (see, for
example, [1, 12, 14, 17, 20]). Other examples of such situations include problems in scattering
theory, in computational elasticity, in fluid dynamics, and in other fields.

In this paper, we investigate in detail the analytical structure of the
integro-pseudodifferential equations obtained when Neumann problems are solved via double layer potentials, when
Dirichlet problems are solved via quadruple layer potentials, when Neumann problems are
solved via quadruple layer potentials, and in several other cases (see (11) - (29) in Section 2
for a detailed list). It turns out that the analytical structure of the obtained equations is
quite simple, and involves several standard pseudodifferential operators (derivative, Hilbert
transform, derivative of Hilbert transform, inverse of the derivative of the Hilbert transform, and
the second derivative), composed (from the left or the right) with simple diagonal operators. We
also show that the product of the standard hypersingular integral operator with the standard
first kind integral operator of classical potential theory is a second kind integral operator;
asymptotically speaking, these operators are perfect preconditioners for each other.

Thus, the purpose of this paper is a detailed analytical investigation of the
integro-pseudodifferential operators converting the densities of charge, dipole, quadrupole, and octapole
distributions on a smooth curve in $\mathbb{R}^2$ into the potential, normal derivative of the potential, second
normal derivative of the potential, and third normal derivative of the potential on that curve.

It turns out that each of these operators is a sum of a standard operator (obtained by replacing
the curve with a circle), an integral operator with a smooth kernel, and a diagonal operator.
Once such expressions are obtained, it is quite easy to construct discretizations of the
underlying integro-pseudodifferential operators that are adaptive, stable and of arbitrarily high order.
Such discretizations (and resulting PDE solvers) have been constructed and will be reported
in a sequel [10] to this paper.

Remark 1.1  While the results reported here are easily generalized to three dimensions, it
should be pointed out that there exist important classes of problems in three dimensions
leading to integro-differential equations that are outside the scope of this paper. Specifically, when
frequency-domain equations of electromagnetic scattering are reduced to integral equations on
the boundary of the scatterer (yielding the so-called Stratton-Chu equations), the resulting
integro-pseudodifferential operators are of a type not investigated here (in addition to normal
derivatives on the boundary, they involve tangential derivatives); similarly, integral equations
of elastic (as opposed to acoustic) scattering lead to integral expressions whose analysis is not
a straightforward extension of that presented in this paper. Needless to say, such operators are
frequently encountered in applications; they are currently under investigation.

The  structure  of  this  paper  is  as  follows.  In  Section  2,  we  list  the  identities  that  are  the 
purpose  of  this  paper;  the  remainder  of  the  paper  is  devoted  to  proving  these  identities.  In 
Section  3  the  necessary  mathematical  preliminaries  are  introduced.  In  Section  4  we  present 
proofs  of  some  of  the  results  formulated  in  Section  2;  when  the  proofs  of  several  results  are 
almost  identical,  we  only  prove  one  of  them.  Finally,  in  Section  5  we  briefly  discuss  extensions 
of  results  of  this  paper  to  three  dimensions,  and  to  boundary  conditions  other  than  Dirichlet, 
Neumann,  and  Robin. 

Remark  1.2  The  principal  purpose  of  this  paper  is  to  present  the  explicit  formulae  (50)  - 
(68),  (89)  -  (93),  (94)  -  (99),  (100)  -  (107),  to  be  used  in  the  design  of  numerical  tools  for  the 
solution  of  partial  differential  equations.  The  proofs  of  these  formulae  in  Section  4  below  are  a 
fairly  standard  exercise  in  classical  analysis,  provided  here  for  the  sake  of  completeness.  The 
authors  expect  that  many  readers  will  find  it  unnecessary  to  read  this  paper  beyond  Section  2. 


2  Statement  of  Results 

2.1  Notation 

We will be considering Dirichlet and Neumann problems for Laplace's equation in the interior
or the exterior of an open region $\Omega$ bounded by a Jordan curve $\gamma(t) = (x_1(t), x_2(t))$ in $\mathbb{R}^2$, where
$t \in [0, L]$. We will assume that $\gamma$ is sufficiently smooth, and parametrized by its arclength. The
image of $\gamma$ will be denoted by $\Gamma$, so that $\partial\Omega = \Gamma$. For a vector $y = (y_1, y_2) \in \mathbb{R}^2$, we will denote
its Euclidean norm by $\|y\|$. Further, $c(t)$ will denote the curvature, and $N_\gamma(t)$, or simply $N(t)$,
the exterior unit normal to $\Gamma$ at $\gamma(t)$. Clearly,

$$ N(t) = (x_2'(t), -x_1'(t)); \tag{1} $$

the situation is illustrated in Fig. 1.

A charge of unit intensity located at the point $x_0 \in \mathbb{R}^2$ generates a potential
$\Phi_{x_0} : \mathbb{R}^2 \setminus \{x_0\} \to \mathbb{R}$, given by the expression

$$ \Phi_{x_0}(x) = -\log(\|x - x_0\|), \tag{2} $$

for all $x \neq x_0$. Further, the potential of a unit strength dipole located at $x_0 \in \mathbb{R}^2$, and oriented
in the direction $h \in \mathbb{R}^2$, $\|h\| = 1$, is described by the formula

$$ \Phi_{x_0,h}(x) = \frac{\langle h, x - x_0 \rangle}{\|x - x_0\|^2}. \tag{3} $$

As is well known, the potential due to a point charge at $x_0 \in \mathbb{R}^2$, defined by formula (2), is
harmonic in any region excluding the source point $x_0$.
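
The harmonicity of the point-charge potential (2) can be checked numerically. The sketch below applies a five-point finite-difference Laplacian to $\Phi_{x_0}$ at a point away from the source; the step size, the test point and the function names are choices made for this illustration.

```python
import math


def phi(x, x0):
    """Potential (2) of a unit charge at x0, evaluated at x (2D points as tuples)."""
    return -math.log(math.hypot(x[0] - x0[0], x[1] - x0[1]))


def laplacian_fd(f, x, h=1e-3):
    """Five-point finite-difference approximation of the Laplacian of f at x."""
    return (f((x[0] + h, x[1])) + f((x[0] - h, x[1]))
            + f((x[0], x[1] + h)) + f((x[0], x[1] - h))
            - 4.0 * f(x)) / (h * h)


if __name__ == "__main__":
    x0 = (0.0, 0.0)
    value = laplacian_fd(lambda x: phi(x, x0), (1.0, 0.5))
    print("discrete Laplacian away from the source:", value)
```

The printed value is zero up to discretization and rounding error, as expected for a harmonic function.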

Figure 1: Boundary value problem in $\mathbb{R}^2$.


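Definition 2.1  Suppose that $\sigma : [0, L] \to \mathbb{R}$ is an integrable function. Then we will refer to
the functions $p^0_{\gamma,\sigma} : \mathbb{R}^2 \to \mathbb{R}$ and $p^1_{\gamma,\sigma}, p^2_{\gamma,\sigma}, p^3_{\gamma,\sigma} : \mathbb{R}^2 \setminus \Gamma \to \mathbb{R}$, given by the formulae

$$ p^0_{\gamma,\sigma}(x) = \int_0^L \Phi_{\gamma(t)}(x)\, \sigma(t)\, dt, \tag{4} $$

$$ p^1_{\gamma,\sigma}(x) = \int_0^L \frac{\partial \Phi_{\gamma(t)}(x)}{\partial N(t)}\, \sigma(t)\, dt, \tag{5} $$

$$ p^2_{\gamma,\sigma}(x) = \int_0^L \frac{\partial^2 \Phi_{\gamma(t)}(x)}{\partial N(t)^2}\, \sigma(t)\, dt, \tag{6} $$

$$ p^3_{\gamma,\sigma}(x) = \int_0^L \frac{\partial^3 \Phi_{\gamma(t)}(x)}{\partial N(t)^3}\, \sigma(t)\, dt, \tag{7} $$

as the single, double, quadruple and octuple layer potentials, respectively.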
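
As a concrete instance of formula (4), the sketch below evaluates the single layer potential of a constant density on the unit circle with the trapezoidal rule. The choice of curve, the density, and the number of quadrature nodes are assumptions of this example. For $\sigma \equiv 1$ on the unit circle the classical result is $p^0_{\gamma,\sigma}(x) = 0$ for $\|x\| < 1$ and $p^0_{\gamma,\sigma}(x) = -2\pi \log \|x\|$ for $\|x\| > 1$.

```python
import math


def single_layer_unit_circle(x, sigma, n=200):
    """Trapezoidal approximation of (4) for gamma(t) = (cos t, sin t), t in [0, 2*pi].

    x is a point off the curve; sigma is a callable density sigma(t).
    """
    h = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = k * h
        gx, gy = math.cos(t), math.sin(t)          # point gamma(t) on the curve
        dist = math.hypot(x[0] - gx, x[1] - gy)
        total += -math.log(dist) * sigma(t)        # Phi_{gamma(t)}(x) * sigma(t)
    return total * h


if __name__ == "__main__":
    one = lambda t: 1.0
    print(single_layer_unit_circle((0.3, 0.2), one))   # interior point
    print(single_layer_unit_circle((2.0, 0.0), one))   # exterior point
    print(-2.0 * math.pi * math.log(2.0))              # reference exterior value
```

The trapezoidal rule is used because the integrand is smooth and periodic when $x$ is off the curve; it is not suitable as written for points on or very near $\Gamma$.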

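Remark 2.1  The functions $\frac{\partial \Phi_{\gamma(t)}}{\partial N(t)}$, $\frac{\partial^2 \Phi_{\gamma(t)}}{\partial N(t)^2}$, $\frac{\partial^3 \Phi_{\gamma(t)}}{\partial N(t)^3} : \mathbb{R}^2 \setminus \{\gamma(t)\} \to \mathbb{R}$ are often referred to as
the dipole, quadrupole and octapole potentials, respectively. Obviously,

$$ \frac{\partial \Phi_{\gamma(t)}(x)}{\partial N(t)} = \frac{\langle N(t), x - \gamma(t) \rangle}{\|x - \gamma(t)\|^2}, \tag{8} $$

$$ \frac{\partial^2 \Phi_{\gamma(t)}(x)}{\partial N(t)^2} = \frac{2 \langle N(t), x - \gamma(t) \rangle^2}{\|x - \gamma(t)\|^4} - \frac{1}{\|x - \gamma(t)\|^2}, \tag{9} $$

$$ \frac{\partial^3 \Phi_{\gamma(t)}(x)}{\partial N(t)^3} = \frac{8 \langle N(t), x - \gamma(t) \rangle^3}{\|x - \gamma(t)\|^6} - \frac{6 \langle N(t), x - \gamma(t) \rangle}{\|x - \gamma(t)\|^4}. \tag{10} $$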
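
Formulae (8) - (10) can be verified by differentiating numerically with respect to the source position along the normal. The sketch below does this with central differences; the sample points, the unit vector standing in for $N(t)$, and the function names are arbitrary choices for this illustration.

```python
import math


def _dot_and_r2(x, x0, n):
    """Return <n, x - x0> and ||x - x0||^2."""
    dx, dy = x[0] - x0[0], x[1] - x0[1]
    return n[0] * dx + n[1] * dy, dx * dx + dy * dy


def charge(x, x0, n):
    """Point-charge potential (2); n is unused and kept for a uniform signature."""
    _, r2 = _dot_and_r2(x, x0, n)
    return -0.5 * math.log(r2)


def dipole(x, x0, n):
    d, r2 = _dot_and_r2(x, x0, n)
    return d / r2                                   # formula (8)


def quadrupole(x, x0, n):
    d, r2 = _dot_and_r2(x, x0, n)
    return 2.0 * d * d / r2**2 - 1.0 / r2           # formula (9)


def octapole(x, x0, n):
    d, r2 = _dot_and_r2(x, x0, n)
    return 8.0 * d**3 / r2**3 - 6.0 * d / r2**2     # formula (10)


def source_derivative(f, x, x0, n, h=1e-4):
    """Central difference of f with respect to the source x0 in direction n."""
    plus = (x0[0] + h * n[0], x0[1] + h * n[1])
    minus = (x0[0] - h * n[0], x0[1] - h * n[1])
    return (f(x, plus, n) - f(x, minus, n)) / (2.0 * h)


if __name__ == "__main__":
    x, x0, n = (1.3, -0.4), (0.2, 0.1), (0.6, 0.8)
    print(dipole(x, x0, n), source_derivative(charge, x, x0, n))
    print(quadrupole(x, x0, n), source_derivative(dipole, x, x0, n))
    print(octapole(x, x0, n), source_derivative(quadrupole, x, x0, n))
```

Each printed pair should agree to roughly the accuracy of the finite-difference step.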

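Clearly, the potentials $p^1_{\gamma,\sigma}$, $p^2_{\gamma,\sigma}$, $p^3_{\gamma,\sigma}$ are analytic in the interior of $\Omega$ for any integrable
$\sigma$. However, for sufficiently smooth $\sigma$ and $\gamma$, they can be extended to $\bar{\Omega}$ as smooth functions.
Similarly, the potentials $p^1_{\gamma,\sigma}$, $p^2_{\gamma,\sigma}$, $p^3_{\gamma,\sigma}$ are analytic functions in the exterior $\mathbb{R}^2 \setminus \bar{\Omega}$ of $\Omega$,
and can be extended as smooth functions to $\mathbb{R}^2 \setminus \Omega$. Furthermore, the normal derivatives
of these potentials also can be extended up to the boundary as smooth functions. Needless
to say, the interior and exterior extensions do not necessarily agree on the boundary $\Gamma$ (with
the obvious exception of $p^0_{\gamma,\sigma}(x)$), and we introduce the functions $p^0_{\gamma,\sigma}$, $p^{1,0}_{\gamma,\sigma,i}$, $p^{1,0}_{\gamma,\sigma,e}$, $p^{2,0}_{\gamma,\sigma,i}$, $p^{2,0}_{\gamma,\sigma,e}$,
$p^{3,0}_{\gamma,\sigma,i}$, $p^{3,0}_{\gamma,\sigma,e}$, $p^{0,1}_{\gamma,\sigma,i}$, $p^{0,1}_{\gamma,\sigma,e}$, $p^{1,1}_{\gamma,\sigma,i}$, $p^{1,1}_{\gamma,\sigma,e}$, $p^{2,1}_{\gamma,\sigma,i}$, $p^{2,1}_{\gamma,\sigma,e}$, $p^{0,2}_{\gamma,\sigma,i}$, $p^{0,2}_{\gamma,\sigma,e}$, $p^{1,2}_{\gamma,\sigma,i}$, $p^{1,2}_{\gamma,\sigma,e}$, $p^{0,3}_{\gamma,\sigma,i}$,
$p^{0,3}_{\gamma,\sigma,e} : [0, L] \to \mathbb{R}$ via the formulae

$$ p^{0}_{\gamma,\sigma}(s) = \int_0^L \Phi_{\gamma(t)}(\gamma(s))\, \sigma(t)\, dt, \tag{11} $$

$$ p^{1,0}_{\gamma,\sigma,i}(s) = \lim_{h \to 0} \int_0^L \frac{\partial \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(t)}\, \sigma(t)\, dt, \tag{12} $$

$$ p^{1,0}_{\gamma,\sigma,e}(s) = \lim_{h \to 0} \int_0^L \frac{\partial \Phi_{\gamma(t)}(\gamma(s) + h \cdot N(s))}{\partial N(t)}\, \sigma(t)\, dt, \tag{13} $$

$$ p^{2,0}_{\gamma,\sigma,i}(s) = \lim_{h \to 0} \int_0^L \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(t)^2}\, \sigma(t)\, dt, \tag{14} $$

$$ p^{2,0}_{\gamma,\sigma,e}(s) = \lim_{h \to 0} \int_0^L \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) + h \cdot N(s))}{\partial N(t)^2}\, \sigma(t)\, dt, \tag{15} $$

$$ p^{3,0}_{\gamma,\sigma,i}(s) = \lim_{h \to 0} \int_0^L \frac{\partial^3 \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(t)^3}\, \sigma(t)\, dt, \tag{16} $$

$$ p^{3,0}_{\gamma,\sigma,e}(s) = \lim_{h \to 0} \int_0^L \frac{\partial^3 \Phi_{\gamma(t)}(\gamma(s) + h \cdot N(s))}{\partial N(t)^3}\, \sigma(t)\, dt, \tag{17} $$

$$ p^{0,1}_{\gamma,\sigma,i}(s) = \lim_{h \to 0} \int_0^L \frac{\partial \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(s)}\, \sigma(t)\, dt, \tag{18} $$

$$ p^{0,1}_{\gamma,\sigma,e}(s) = \lim_{h \to 0} \int_0^L \frac{\partial \Phi_{\gamma(t)}(\gamma(s) + h \cdot N(s))}{\partial N(s)}\, \sigma(t)\, dt, \tag{19} $$

$$ p^{1,1}_{\gamma,\sigma,i}(s) = \lim_{h \to 0} \int_0^L \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(s)\, \partial N(t)}\, \sigma(t)\, dt, \tag{20} $$

$$ p^{1,1}_{\gamma,\sigma,e}(s) = \lim_{h \to 0} \int_0^L \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) + h \cdot N(s))}{\partial N(s)\, \partial N(t)}\, \sigma(t)\, dt, \tag{21} $$

$$ p^{2,1}_{\gamma,\sigma,i}(s) = \lim_{h \to 0} \int_0^L \frac{\partial^3 \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(s)\, \partial N(t)^2}\, \sigma(t)\, dt, \tag{22} $$

$$ p^{2,1}_{\gamma,\sigma,e}(s) = \lim_{h \to 0} \int_0^L \frac{\partial^3 \Phi_{\gamma(t)}(\gamma(s) + h \cdot N(s))}{\partial N(s)\, \partial N(t)^2}\, \sigma(t)\, dt, \tag{23} $$

$$ p^{0,2}_{\gamma,\sigma,i}(s) = \lim_{h \to 0} \int_0^L \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(s)^2}\, \sigma(t)\, dt, \tag{24} $$

$$ p^{0,2}_{\gamma,\sigma,e}(s) = \lim_{h \to 0} \int_0^L \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) + h \cdot N(s))}{\partial N(s)^2}\, \sigma(t)\, dt, \tag{25} $$

$$ p^{1,2}_{\gamma,\sigma,i}(s) = \lim_{h \to 0} \int_0^L \frac{\partial^3 \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(s)^2\, \partial N(t)}\, \sigma(t)\, dt, \tag{26} $$

$$ p^{1,2}_{\gamma,\sigma,e}(s) = \lim_{h \to 0} \int_0^L \frac{\partial^3 \Phi_{\gamma(t)}(\gamma(s) + h \cdot N(s))}{\partial N(s)^2\, \partial N(t)}\, \sigma(t)\, dt, \tag{27} $$

$$ p^{0,3}_{\gamma,\sigma,i}(s) = \lim_{h \to 0} \int_0^L \frac{\partial^3 \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(s)^3}\, \sigma(t)\, dt, \tag{28} $$

$$ p^{0,3}_{\gamma,\sigma,e}(s) = \lim_{h \to 0} \int_0^L \frac{\partial^3 \Phi_{\gamma(t)}(\gamma(s) + h \cdot N(s))}{\partial N(s)^3}\, \sigma(t)\, dt. \tag{29} $$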
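
That the interior and exterior limits differ can be seen in the simplest possible case. The sketch below evaluates the double layer potential (5) with the kernel (8) for the constant density $\sigma \equiv 1$ on the unit circle, where $N(t) = \gamma(t)$, at one interior and one exterior point. The curve, density, node count and function name are assumptions of this example. For this configuration the potential equals $-2\pi$ everywhere inside and $0$ everywhere outside, so the limits in (12) and (13) are $-2\pi$ and $0$.

```python
import math


def double_layer_unit_circle(x, sigma, n=400):
    """Trapezoidal approximation of the double layer potential (5) on the unit circle.

    Uses the dipole kernel (8) with gamma(t) = (cos t, sin t) and N(t) = gamma(t).
    """
    h = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = k * h
        gx, gy = math.cos(t), math.sin(t)    # gamma(t); also the exterior normal N(t)
        dx, dy = x[0] - gx, x[1] - gy        # x - gamma(t)
        total += (gx * dx + gy * dy) / (dx * dx + dy * dy) * sigma(t)
    return total * h


if __name__ == "__main__":
    one = lambda t: 1.0
    print("inside :", double_layer_unit_circle((0.5, 0.0), one))
    print("outside:", double_layer_unit_circle((2.0, 0.0), one))
```

On the circle itself the kernel reduces to the constant $-1/2$ for $s \neq t$, so the direct value of the integral is $-\pi$, halfway between the two one-sided limits.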


Remark 2.2  Throughout the paper, the subscripts "i" and "e" will denote the limits from
the interior and the exterior towards the boundary, respectively. Furthermore, the superscripts
$i, j$ (as, for example, in $p^{i,j}_{\gamma,\sigma,e}(s)$) refer to $i$-fold and $j$-fold differentiation with respect
to $N(t)$ and $N(s)$, respectively.

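Definition 2.2  Suppose that the function $\sigma : [0, L] \to \mathbb{R}$ is twice continuously differentiable,
and that $\gamma$ is sufficiently smooth. Then we define the operators $K^{0}_{\gamma}$, $K^{1,0}_{\gamma,i}$, $K^{1,0}_{\gamma,e}$, $K^{2,0}_{\gamma,i}$, $K^{2,0}_{\gamma,e}$,
$K^{3,0}_{\gamma,i}$, $K^{3,0}_{\gamma,e}$, $K^{0,1}_{\gamma,i}$, $K^{0,1}_{\gamma,e}$, $K^{1,1}_{\gamma,i}$, $K^{1,1}_{\gamma,e}$, $K^{2,1}_{\gamma,i}$, $K^{2,1}_{\gamma,e}$, $K^{0,2}_{\gamma,i}$, $K^{0,2}_{\gamma,e}$, $K^{1,2}_{\gamma,i}$, $K^{1,2}_{\gamma,e}$, $K^{0,3}_{\gamma,i}$, $K^{0,3}_{\gamma,e} : C^2[0, L] \to C[0, L]$
via the formulae

$$ K^{0}_{\gamma}(\sigma)(s) = p^{0}_{\gamma,\sigma}(s) = \int_0^L \Phi_{\gamma(t)}(\gamma(s))\, \sigma(t)\, dt, \tag{30} $$

$$ K^{1,0}_{\gamma,i}(\sigma)(s) = p^{1,0}_{\gamma,\sigma,i}(s) = \lim_{h \to 0} \int_0^L \frac{\partial \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(t)}\, \sigma(t)\, dt, \tag{31} $$

$$ K^{1,0}_{\gamma,e}(\sigma)(s) = p^{1,0}_{\gamma,\sigma,e}(s) = \lim_{h \to 0} \int_0^L \frac{\partial \Phi_{\gamma(t)}(\gamma(s) + h \cdot N(s))}{\partial N(t)}\, \sigma(t)\, dt, \tag{32} $$

$$ K^{2,0}_{\gamma,i}(\sigma)(s) = p^{2,0}_{\gamma,\sigma,i}(s) = \lim_{h \to 0} \int_0^L \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(t)^2}\, \sigma(t)\, dt, \tag{33} $$

$$ K^{2,0}_{\gamma,e}(\sigma)(s) = p^{2,0}_{\gamma,\sigma,e}(s) = \lim_{h \to 0} \int_0^L \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) + h \cdot N(s))}{\partial N(t)^2}\, \sigma(t)\, dt, \tag{34} $$

$$ K^{3,0}_{\gamma,i}(\sigma)(s) = p^{3,0}_{\gamma,\sigma,i}(s) = \lim_{h \to 0} \int_0^L \frac{\partial^3 \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(t)^3}\, \sigma(t)\, dt, \tag{35} $$

$$ K^{3,0}_{\gamma,e}(\sigma)(s) = p^{3,0}_{\gamma,\sigma,e}(s) = \lim_{h \to 0} \int_0^L \frac{\partial^3 \Phi_{\gamma(t)}(\gamma(s) + h \cdot N(s))}{\partial N(t)^3}\, \sigma(t)\, dt, \tag{36} $$

$$ K^{0,1}_{\gamma,i}(\sigma)(s) = p^{0,1}_{\gamma,\sigma,i}(s) = \lim_{h \to 0} \int_0^L \frac{\partial \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(s)}\, \sigma(t)\, dt, \tag{37} $$

$$ K^{0,1}_{\gamma,e}(\sigma)(s) = p^{0,1}_{\gamma,\sigma,e}(s) = \lim_{h \to 0} \int_0^L \frac{\partial \Phi_{\gamma(t)}(\gamma(s) + h \cdot N(s))}{\partial N(s)}\, \sigma(t)\, dt, \tag{38} $$

$$ K^{1,1}_{\gamma,i}(\sigma)(s) = p^{1,1}_{\gamma,\sigma,i}(s) = \lim_{h \to 0} \int_0^L \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(s)\, \partial N(t)}\, \sigma(t)\, dt, \tag{39} $$

$$ K^{1,1}_{\gamma,e}(\sigma)(s) = p^{1,1}_{\gamma,\sigma,e}(s) = \lim_{h \to 0} \int_0^L \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) + h \cdot N(s))}{\partial N(s)\, \partial N(t)}\, \sigma(t)\, dt, \tag{40} $$

$$ K^{2,1}_{\gamma,i}(\sigma)(s) = p^{2,1}_{\gamma,\sigma,i}(s) = \lim_{h \to 0} \int_0^L \frac{\partial^3 \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(s)\, \partial N(t)^2}\, \sigma(t)\, dt, \tag{41} $$

$$ K^{2,1}_{\gamma,e}(\sigma)(s) = p^{2,1}_{\gamma,\sigma,e}(s) = \lim_{h \to 0} \int_0^L \frac{\partial^3 \Phi_{\gamma(t)}(\gamma(s) + h \cdot N(s))}{\partial N(s)\, \partial N(t)^2}\, \sigma(t)\, dt, \tag{42} $$

$$ K^{0,2}_{\gamma,i}(\sigma)(s) = p^{0,2}_{\gamma,\sigma,i}(s) = \lim_{h \to 0} \int_0^L \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(s)^2}\, \sigma(t)\, dt, \tag{43} $$

$$ K^{0,2}_{\gamma,e}(\sigma)(s) = p^{0,2}_{\gamma,\sigma,e}(s) = \lim_{h \to 0} \int_0^L \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) + h \cdot N(s))}{\partial N(s)^2}\, \sigma(t)\, dt, \tag{44} $$

$$ K^{1,2}_{\gamma,i}(\sigma)(s) = p^{1,2}_{\gamma,\sigma,i}(s) = \lim_{h \to 0} \int_0^L \frac{\partial^3 \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(s)^2\, \partial N(t)}\, \sigma(t)\, dt, \tag{45} $$

$$ K^{1,2}_{\gamma,e}(\sigma)(s) = p^{1,2}_{\gamma,\sigma,e}(s) = \lim_{h \to 0} \int_0^L \frac{\partial^3 \Phi_{\gamma(t)}(\gamma(s) + h \cdot N(s))}{\partial N(s)^2\, \partial N(t)}\, \sigma(t)\, dt, \tag{46} $$

$$ K^{0,3}_{\gamma,i}(\sigma)(s) = p^{0,3}_{\gamma,\sigma,i}(s) = \lim_{h \to 0} \int_0^L \frac{\partial^3 \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(s)^3}\, \sigma(t)\, dt, \tag{47} $$

$$ K^{0,3}_{\gamma,e}(\sigma)(s) = p^{0,3}_{\gamma,\sigma,e}(s) = \lim_{h \to 0} \int_0^L \frac{\partial^3 \Phi_{\gamma(t)}(\gamma(s) + h \cdot N(s))}{\partial N(s)^3}\, \sigma(t)\, dt. \tag{48} $$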

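Remark 2.3 Obviously, the operators $K^{0,1}_{\gamma,i}$, $K^{0,1}_{\gamma,e}$, $K^{0,2}_{\gamma,i}$, $K^{0,2}_{\gamma,e}$, $K^{1,2}_{\gamma,i}$, $K^{1,2}_{\gamma,e}$, $K^{0,3}_{\gamma,i}$, $K^{0,3}_{\gamma,e}$ given by
the formulae (37), (38), (43) - (48) are the adjoints of the operators $K^{1,0}_{\gamma,i}$, $K^{1,0}_{\gamma,e}$, $K^{2,0}_{\gamma,i}$, $K^{2,0}_{\gamma,e}$,
$K^{3,0}_{\gamma,i}$, $K^{3,0}_{\gamma,e}$, $K^{2,1}_{\gamma,i}$, $K^{2,1}_{\gamma,e}$ defined by (31) - (36), (41), (42), respectively. Furthermore, $K^{0}_{\gamma}$, $K^{1,1}_{\gamma,i}$,
$K^{1,1}_{\gamma,e}$ defined by (30), (39), (40) are self-adjoint.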


2.2  Physical  Interpretation 

Formulae (30) - (48) have simple physical interpretations. Specifically, $K^{0}_{\gamma}$ is the linear operator
converting a charge distribution on the curve $\Gamma$ into the potential of that charge distribution
on $\Gamma$. The operator $K^{1,0}_{\gamma,i}$ converts a dipole distribution on $\Gamma$ into the potential created by that
distribution on the inside of $\Gamma$; the operator $K^{1,0}_{\gamma,e}$ converts a dipole distribution on $\Gamma$ into the
potential created by that distribution on the outside of $\Gamma$. The operator $K^{0,1}_{\gamma,e}$ converts a charge
distribution on $\Gamma$ into the normal derivative of the potential created by that distribution on
the outside of $\Gamma$, etc.

Generally, the first superscript denotes the number of differentiations at the source (charges,
dipoles, quadrupoles, or octupoles); the second superscript denotes the number of differentiations
at the point where the potential is evaluated (potential, normal derivative of the potential,
second normal derivative of the potential, third normal derivative of the potential). In agreement
with standard practice in the theory of pseudodifferential operators, we will define the
order $k$ of either of the operators $K^{i,j}_{\gamma,i}$ and $K^{i,j}_{\gamma,e}$ by the formula

$$k = i + j - 1, \tag{49}$$

and observe that in this paper, we describe in detail all operators of potential theory whose
order does not exceed 2. For example, we do investigate the operator $K^{1,2}_{\gamma,i}$, converting a dipole
distribution on $\Gamma$ into its second normal derivative, but we do not investigate the operator $K^{2,2}_{\gamma,i}$
converting a quadrupole distribution on $\Gamma$ into its second normal derivative.

An examination of formulae (50) - (68) shows that the complexity of the expressions describing
the operators (30) - (48) on the circle hardly increases as the order of the operator
grows. On the other hand, the differences between the operators (30) - (48) on the circle
and those on an arbitrary curve become more complicated with the growth of the order of
the operator. For example, the operators $K^{0}_{\gamma}$, $K^{1,0}_{\gamma,i}$, $K^{1,0}_{\gamma,e}$, $K^{0,1}_{\gamma,i}$, $K^{0,1}_{\gamma,e}$ on an arbitrary smooth
curve always differ from these operators on the circle by a compact operator (see formulae (89)
- (93)). Similar differences for the operators $K^{2,0}_{\gamma,i}$, $K^{2,0}_{\gamma,e}$, $K^{1,1}_{\gamma,i}$, $K^{1,1}_{\gamma,e}$, $K^{0,2}_{\gamma,i}$, $K^{0,2}_{\gamma,e}$ involve the
curvature of $\gamma$ (see (94) - (99)). For the operators $K^{3,0}_{\gamma,i}$, $K^{3,0}_{\gamma,e}$, $K^{2,1}_{\gamma,i}$, $K^{2,1}_{\gamma,e}$, $K^{1,2}_{\gamma,i}$, $K^{1,2}_{\gamma,e}$, $K^{0,3}_{\gamma,i}$,
$K^{0,3}_{\gamma,e}$, the corresponding formulae (100) - (107) already involve the square and the derivative
of the curvature, as well as the Hilbert transform of the function.


Remark  2.4  While  it  is  certainly  possible  to  derive  explicit  expressions  for  boundary  integral 
operators  of  orders  higher  than  2,  the  complexity  of  the  resulting  formulae  grows,  while  their 
numerical  utility  decreases.  The  authors  have  chosen  to  draw  the  line  at  the  order  2,  mostly 
because  in  the  applications  they  anticipate,  order  1  is  sufficient. 

Remark  2.5  While  many  of  the  facts  presented  in  this  paper  can  be  obtained  “automatically” 
from  the  standard  theory  of  pseudodifferential  operators,  the  purpose  of  this  paper  is  to  provide 
the  explicit  expressions  (50)  -  (68)  to  be  used  in  numerical  calculations.  Thus,  we  are  ignoring 
the  connections  between  the  formulae  (50)  -  (68),  (89)  -  (93),  (94)  -  (99),  (100)  -  (107),  and 
the  more  general  theory  of  pseudodifferential  operators. 


2.3  Results 

The  limits  (12),  (13),  (18),  (19)  have  been  studied  in  detail  in  the  literature  (see,  for  example, 
[13,  11]).  In  Section  4,  we  conduct  a  similar  investigation  of  (14)  -  (17),  (20)  -  (29);  first  for  a 
circle,  and  then  for  a  sufficiently  smooth  Jordan  curve.  The  following  theorem  provides  explicit 
expressions  for  the  operators  (30)  -  (48)  on  the  circle. 

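Theorem 2.6 Suppose that $\gamma$ is a circle of radius $r$ parametrized by its arclength with the
exterior unit normal denoted by $N$, $k$ is an arbitrary integer, and $s \in [-\pi r, \pi r]$. Then,

(a)
$$K^{0}_{\gamma}(e^{ikt/r})(s) = p^{0}_{\gamma,e^{ikt/r}}(s) = \int_{-\pi r}^{\pi r} \Phi_{\gamma(t)}(\gamma(s)) \, e^{ikt/r} \, dt = \begin{cases} \pi |k|^{-1} r \, e^{iks/r}, & \text{for } k \neq 0, \\ -2\pi r \log(r), & \text{for } k = 0, \end{cases} \tag{50}$$

(b)
$$K^{1,0}_{\gamma,i}(e^{ikt/r})(s) = p^{1,0}_{\gamma,e^{ikt/r},i}(s) = \lim_{h \to 0} \int_{-\pi r}^{\pi r} \frac{\partial \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(t)} \, e^{ikt/r} \, dt = \begin{cases} -\pi e^{iks/r}, & \text{for } k \neq 0, \\ -2\pi, & \text{for } k = 0, \end{cases} \tag{51}$$

$$K^{1,0}_{\gamma,e}(e^{ikt/r})(s) = p^{1,0}_{\gamma,e^{ikt/r},e}(s) = \lim_{h \to 0} \int_{-\pi r}^{\pi r} \frac{\partial \Phi_{\gamma(t)}(\gamma(s) + h \cdot N(s))}{\partial N(t)} \, e^{ikt/r} \, dt = \begin{cases} \pi e^{iks/r}, & \text{for } k \neq 0, \\ 0, & \text{for } k = 0, \end{cases} \tag{52}$$

(c)
$$K^{2,0}_{\gamma,i}(e^{ikt/r})(s) = p^{2,0}_{\gamma,e^{ikt/r},i}(s) = \lim_{h \to 0} \int_{-\pi r}^{\pi r} \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(t)^2} \, e^{ikt/r} \, dt = \begin{cases} \pi (|k| + 1) \, r^{-1} e^{iks/r}, & \text{for } k \neq 0, \\ 2\pi r^{-1}, & \text{for } k = 0, \end{cases} \tag{53}$$

$$K^{2,0}_{\gamma,e}(e^{ikt/r})(s) = p^{2,0}_{\gamma,e^{ikt/r},e}(s) = \lim_{h \to 0} \int_{-\pi r}^{\pi r} \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) + h \cdot N(s))}{\partial N(t)^2} \, e^{ikt/r} \, dt = \begin{cases} \pi (|k| - 1) \, r^{-1} e^{iks/r}, & \text{for } k \neq 0, \\ 0, & \text{for } k = 0, \end{cases} \tag{54}$$

(d)
$$K^{3,0}_{\gamma,i}(e^{ikt/r})(s) = p^{3,0}_{\gamma,e^{ikt/r},i}(s) = \lim_{h \to 0} \int_{-\pi r}^{\pi r} \frac{\partial^3 \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(t)^3} \, e^{ikt/r} \, dt = \begin{cases} -\pi (|k| + 1)(|k| + 2) \, r^{-2} e^{iks/r}, & \text{for } k \neq 0, \\ -4\pi r^{-2}, & \text{for } k = 0, \end{cases} \tag{55}$$

$$K^{3,0}_{\gamma,e}(e^{ikt/r})(s) = p^{3,0}_{\gamma,e^{ikt/r},e}(s) = \lim_{h \to 0} \int_{-\pi r}^{\pi r} \frac{\partial^3 \Phi_{\gamma(t)}(\gamma(s) + h \cdot N(s))}{\partial N(t)^3} \, e^{ikt/r} \, dt = \begin{cases} \pi (|k| - 1)(|k| - 2) \, r^{-2} e^{iks/r}, & \text{for } k \neq 0, \\ 0, & \text{for } k = 0, \end{cases} \tag{56}$$

(e)
$$K^{0,1}_{\gamma,i}(e^{ikt/r})(s) = p^{0,1}_{\gamma,e^{ikt/r},i}(s) = \lim_{h \to 0} \int_{-\pi r}^{\pi r} \frac{\partial \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(s)} \, e^{ikt/r} \, dt = \begin{cases} \pi e^{iks/r}, & \text{for } k \neq 0, \\ 0, & \text{for } k = 0, \end{cases} \tag{57}$$

$$K^{0,1}_{\gamma,e}(e^{ikt/r})(s) = p^{0,1}_{\gamma,e^{ikt/r},e}(s) = \lim_{h \to 0} \int_{-\pi r}^{\pi r} \frac{\partial \Phi_{\gamma(t)}(\gamma(s) + h \cdot N(s))}{\partial N(s)} \, e^{ikt/r} \, dt = \begin{cases} -\pi e^{iks/r}, & \text{for } k \neq 0, \\ -2\pi, & \text{for } k = 0, \end{cases} \tag{58}$$

(f)
$$K^{1,1}_{\gamma,i}(e^{ikt/r})(s) = p^{1,1}_{\gamma,e^{ikt/r},i}(s) = \lim_{h \to 0} \int_{-\pi r}^{\pi r} \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(s) \, \partial N(t)} \, e^{ikt/r} \, dt = \begin{cases} -\pi |k| \, r^{-1} e^{iks/r}, & \text{for } k \neq 0, \\ 0, & \text{for } k = 0, \end{cases} \tag{59}$$

$$K^{1,1}_{\gamma,e}(e^{ikt/r})(s) = p^{1,1}_{\gamma,e^{ikt/r},e}(s) = \lim_{h \to 0} \int_{-\pi r}^{\pi r} \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) + h \cdot N(s))}{\partial N(s) \, \partial N(t)} \, e^{ikt/r} \, dt = \begin{cases} -\pi |k| \, r^{-1} e^{iks/r}, & \text{for } k \neq 0, \\ 0, & \text{for } k = 0, \end{cases} \tag{60}$$

(g)
$$K^{2,1}_{\gamma,i}(e^{ikt/r})(s) = p^{2,1}_{\gamma,e^{ikt/r},i}(s) = \lim_{h \to 0} \int_{-\pi r}^{\pi r} \frac{\partial^3 \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(s) \, \partial N(t)^2} \, e^{ikt/r} \, dt = \begin{cases} \pi |k| (|k| + 1) \, r^{-2} e^{iks/r}, & \text{for } k \neq 0, \\ 0, & \text{for } k = 0, \end{cases} \tag{61}$$

$$K^{2,1}_{\gamma,e}(e^{ikt/r})(s) = p^{2,1}_{\gamma,e^{ikt/r},e}(s) = \lim_{h \to 0} \int_{-\pi r}^{\pi r} \frac{\partial^3 \Phi_{\gamma(t)}(\gamma(s) + h \cdot N(s))}{\partial N(s) \, \partial N(t)^2} \, e^{ikt/r} \, dt = \begin{cases} -\pi |k| (|k| - 1) \, r^{-2} e^{iks/r}, & \text{for } k \neq 0, \\ 0, & \text{for } k = 0, \end{cases} \tag{62}$$

(h)
$$K^{0,2}_{\gamma,i}(e^{ikt/r})(s) = p^{0,2}_{\gamma,e^{ikt/r},i}(s) = \lim_{h \to 0} \int_{-\pi r}^{\pi r} \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(s)^2} \, e^{ikt/r} \, dt = \begin{cases} \pi (|k| - 1) \, r^{-1} e^{iks/r}, & \text{for } k \neq 0, \\ 0, & \text{for } k = 0, \end{cases} \tag{63}$$

$$K^{0,2}_{\gamma,e}(e^{ikt/r})(s) = p^{0,2}_{\gamma,e^{ikt/r},e}(s) = \lim_{h \to 0} \int_{-\pi r}^{\pi r} \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) + h \cdot N(s))}{\partial N(s)^2} \, e^{ikt/r} \, dt = \begin{cases} \pi (|k| + 1) \, r^{-1} e^{iks/r}, & \text{for } k \neq 0, \\ 2\pi r^{-1}, & \text{for } k = 0, \end{cases} \tag{64}$$

(i)
$$K^{1,2}_{\gamma,i}(e^{ikt/r})(s) = p^{1,2}_{\gamma,e^{ikt/r},i}(s) = \lim_{h \to 0} \int_{-\pi r}^{\pi r} \frac{\partial^3 \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(s)^2 \, \partial N(t)} \, e^{ikt/r} \, dt = \begin{cases} -\pi |k| (|k| - 1) \, r^{-2} e^{iks/r}, & \text{for } k \neq 0, \\ 0, & \text{for } k = 0, \end{cases} \tag{65}$$

$$K^{1,2}_{\gamma,e}(e^{ikt/r})(s) = p^{1,2}_{\gamma,e^{ikt/r},e}(s) = \lim_{h \to 0} \int_{-\pi r}^{\pi r} \frac{\partial^3 \Phi_{\gamma(t)}(\gamma(s) + h \cdot N(s))}{\partial N(s)^2 \, \partial N(t)} \, e^{ikt/r} \, dt = \begin{cases} \pi |k| (|k| + 1) \, r^{-2} e^{iks/r}, & \text{for } k \neq 0, \\ 0, & \text{for } k = 0, \end{cases} \tag{66}$$

(j)
$$K^{0,3}_{\gamma,i}(e^{ikt/r})(s) = p^{0,3}_{\gamma,e^{ikt/r},i}(s) = \lim_{h \to 0} \int_{-\pi r}^{\pi r} \frac{\partial^3 \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(s)^3} \, e^{ikt/r} \, dt = \begin{cases} \pi (|k| - 1)(|k| - 2) \, r^{-2} e^{iks/r}, & \text{for } k \neq 0, \\ 0, & \text{for } k = 0, \end{cases} \tag{67}$$

$$K^{0,3}_{\gamma,e}(e^{ikt/r})(s) = p^{0,3}_{\gamma,e^{ikt/r},e}(s) = \lim_{h \to 0} \int_{-\pi r}^{\pi r} \frac{\partial^3 \Phi_{\gamma(t)}(\gamma(s) + h \cdot N(s))}{\partial N(s)^3} \, e^{ikt/r} \, dt = \begin{cases} -\pi (|k| + 1)(|k| + 2) \, r^{-2} e^{iks/r}, & \text{for } k \neq 0, \\ -4\pi r^{-2}, & \text{for } k = 0. \end{cases} \tag{68}$$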
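As a quick numerical sanity check of the double layer limits (51) and (52), the following sketch approximates the interior and exterior integrals on the unit circle with the trapezoidal rule at a small but finite offset h. It assumes the sign convention $\Phi_{x_0}(x) = -\log|x - x_0|$, which is inferred here from the signs in (50) - (52) and is not stated in this section; the function name `double_layer_limit` and the parameters `h` and `n` are illustrative only.

```python
import cmath
import math


def double_layer_limit(k, h, interior=True, n=40000):
    """Trapezoidal estimate of the integral in (51)/(52) for r = 1, s = 0.

    Assumes Phi_{x0}(x) = -log|x - x0| (an inferred convention), so that
    dPhi/dN(t) evaluated at x equals (x - gamma(t)) . N(t) / |x - gamma(t)|^2.
    """
    a = 1.0 - h if interior else 1.0 + h   # x = gamma(0) -/+ h N(0) = (a, 0)
    total = 0.0 + 0.0j
    for j in range(n):
        t = -math.pi + 2.0 * math.pi * j / n
        yx, yy = math.cos(t), math.sin(t)  # gamma(t); on the unit circle N(t) = gamma(t)
        dx, dy = a - yx, -yy               # x - gamma(t)
        kernel = (dx * yx + dy * yy) / (dx * dx + dy * dy)
        total += kernel * cmath.exp(1j * k * t)
    return total * (2.0 * math.pi / n)
```

For a nonzero k the two values should approach $-\pi$ and $\pi$ respectively as h decreases, which is the jump described by (51) and (52); for k = 0 they should be $-2\pi$ and 0.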


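Formulae (50) - (68) describe the action of the operators (30) - (48) on the circle for
functions of the form $e^{ikt/r}$, with $k = 0, \pm 1, \pm 2, \ldots$ Now, it immediately follows from (50) -
(68) that for any periodic function $\sigma : [0, L] \to \mathbb{C}$ given by its Fourier series

$$\sigma(t) = \sum_{k=-\infty}^{\infty} \hat{\sigma}_k \, e^{2\pi i k t / L}, \tag{69}$$

the operators (30) - (48) ($\gamma$ is the circle of radius $r = \frac{L}{2\pi}$) assume the form

(a)
$$K^{0}_{\gamma}(\sigma)(s) = -L \log\!\left(\frac{L}{2\pi}\right) \hat{\sigma}_0 + \frac{L}{2} \sum_{k \neq 0} \frac{\hat{\sigma}_k}{|k|} \, e^{2\pi i k s / L}, \tag{70}$$

(b)
$$K^{1,0}_{\gamma,i}(\sigma)(s) = -2\pi \hat{\sigma}_0 - \pi \sum_{k \neq 0} \hat{\sigma}_k \, e^{2\pi i k s / L} = -\pi \sigma(s) - \pi \hat{\sigma}_0, \tag{71}$$

$$K^{1,0}_{\gamma,e}(\sigma)(s) = \pi \sum_{k \neq 0} \hat{\sigma}_k \, e^{2\pi i k s / L} = \pi \sigma(s) - \pi \hat{\sigma}_0, \tag{72}$$

(c)
$$K^{2,0}_{\gamma,i}(\sigma)(s) = \frac{4\pi^2}{L} \hat{\sigma}_0 + \frac{2\pi^2}{L} \sum_{k \neq 0} (|k| + 1) \, \hat{\sigma}_k \, e^{2\pi i k s / L} = \frac{2\pi^2}{L} \sigma(s) + \pi H(\sigma')(s) + \frac{2\pi^2}{L} \hat{\sigma}_0, \tag{73}$$

$$K^{2,0}_{\gamma,e}(\sigma)(s) = \frac{2\pi^2}{L} \sum_{k \neq 0} (|k| - 1) \, \hat{\sigma}_k \, e^{2\pi i k s / L} = -\frac{2\pi^2}{L} \sigma(s) + \pi H(\sigma')(s) + \frac{2\pi^2}{L} \hat{\sigma}_0, \tag{74}$$

(d)
$$K^{3,0}_{\gamma,i}(\sigma)(s) = -\frac{16\pi^3}{L^2} \hat{\sigma}_0 - \frac{4\pi^3}{L^2} \sum_{k \neq 0} (|k| + 1)(|k| + 2) \, \hat{\sigma}_k \, e^{2\pi i k s / L} = -\frac{8\pi^3}{L^2} \sigma(s) + \pi \sigma''(s) - \frac{6\pi^2}{L} H(\sigma')(s) - \frac{8\pi^3}{L^2} \hat{\sigma}_0, \tag{75}$$

$$K^{3,0}_{\gamma,e}(\sigma)(s) = \frac{4\pi^3}{L^2} \sum_{k \neq 0} (|k| - 1)(|k| - 2) \, \hat{\sigma}_k \, e^{2\pi i k s / L} = \frac{8\pi^3}{L^2} \sigma(s) - \pi \sigma''(s) - \frac{6\pi^2}{L} H(\sigma')(s) - \frac{8\pi^3}{L^2} \hat{\sigma}_0, \tag{76}$$

(e)
$$K^{0,1}_{\gamma,i}(\sigma)(s) = \pi \sum_{k \neq 0} \hat{\sigma}_k \, e^{2\pi i k s / L} = \pi \sigma(s) - \pi \hat{\sigma}_0, \tag{77}$$

$$K^{0,1}_{\gamma,e}(\sigma)(s) = -2\pi \hat{\sigma}_0 - \pi \sum_{k \neq 0} \hat{\sigma}_k \, e^{2\pi i k s / L} = -\pi \sigma(s) - \pi \hat{\sigma}_0, \tag{78}$$

(f)
$$K^{1,1}_{\gamma,i}(\sigma)(s) = -\frac{2\pi^2}{L} \sum_{k \neq 0} |k| \, \hat{\sigma}_k \, e^{2\pi i k s / L} = -\pi H(\sigma')(s), \tag{79}$$

$$K^{1,1}_{\gamma,e}(\sigma)(s) = -\frac{2\pi^2}{L} \sum_{k \neq 0} |k| \, \hat{\sigma}_k \, e^{2\pi i k s / L} = -\pi H(\sigma')(s), \tag{80}$$

(g)
$$K^{2,1}_{\gamma,i}(\sigma)(s) = \frac{4\pi^3}{L^2} \sum_{k \neq 0} |k| (|k| + 1) \, \hat{\sigma}_k \, e^{2\pi i k s / L} = -\pi \sigma''(s) + \frac{2\pi^2}{L} H(\sigma')(s), \tag{81}$$

$$K^{2,1}_{\gamma,e}(\sigma)(s) = -\frac{4\pi^3}{L^2} \sum_{k \neq 0} |k| (|k| - 1) \, \hat{\sigma}_k \, e^{2\pi i k s / L} = \pi \sigma''(s) + \frac{2\pi^2}{L} H(\sigma')(s), \tag{82}$$

(h)
$$K^{0,2}_{\gamma,i}(\sigma)(s) = \frac{2\pi^2}{L} \sum_{k \neq 0} (|k| - 1) \, \hat{\sigma}_k \, e^{2\pi i k s / L} = -\frac{2\pi^2}{L} \sigma(s) + \pi H(\sigma')(s) + \frac{2\pi^2}{L} \hat{\sigma}_0, \tag{83}$$

$$K^{0,2}_{\gamma,e}(\sigma)(s) = \frac{4\pi^2}{L} \hat{\sigma}_0 + \frac{2\pi^2}{L} \sum_{k \neq 0} (|k| + 1) \, \hat{\sigma}_k \, e^{2\pi i k s / L} = \frac{2\pi^2}{L} \sigma(s) + \pi H(\sigma')(s) + \frac{2\pi^2}{L} \hat{\sigma}_0, \tag{84}$$

(i)
$$K^{1,2}_{\gamma,i}(\sigma)(s) = -\frac{4\pi^3}{L^2} \sum_{k \neq 0} |k| (|k| - 1) \, \hat{\sigma}_k \, e^{2\pi i k s / L} = \pi \sigma''(s) + \frac{2\pi^2}{L} H(\sigma')(s), \tag{85}$$

$$K^{1,2}_{\gamma,e}(\sigma)(s) = \frac{4\pi^3}{L^2} \sum_{k \neq 0} |k| (|k| + 1) \, \hat{\sigma}_k \, e^{2\pi i k s / L} = -\pi \sigma''(s) + \frac{2\pi^2}{L} H(\sigma')(s), \tag{86}$$

(j)
$$K^{0,3}_{\gamma,i}(\sigma)(s) = \frac{4\pi^3}{L^2} \sum_{k \neq 0} (|k| - 1)(|k| - 2) \, \hat{\sigma}_k \, e^{2\pi i k s / L} = \frac{8\pi^3}{L^2} \sigma(s) - \pi \sigma''(s) - \frac{6\pi^2}{L} H(\sigma')(s) - \frac{8\pi^3}{L^2} \hat{\sigma}_0, \tag{87}$$

$$K^{0,3}_{\gamma,e}(\sigma)(s) = -\frac{16\pi^3}{L^2} \hat{\sigma}_0 - \frac{4\pi^3}{L^2} \sum_{k \neq 0} (|k| + 1)(|k| + 2) \, \hat{\sigma}_k \, e^{2\pi i k s / L} = -\frac{8\pi^3}{L^2} \sigma(s) + \pi \sigma''(s) - \frac{6\pi^2}{L} H(\sigma')(s) - \frac{8\pi^3}{L^2} \hat{\sigma}_0, \tag{88}$$

with $\hat{\sigma}_k$ denoting the $k$-th Fourier coefficient of the function $\sigma$, and $H$ the Hilbert transform
(see (130) in Section 3.3).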
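To illustrate how the two forms of such an identity relate, the sketch below applies the operator $K^{2,0}_{\gamma,i}$ on the circle of length L in two ways: once through the Fourier multipliers of the series form of (73), and once through the closed form $\frac{2\pi^2}{L}\sigma + \pi H(\sigma') + \frac{2\pi^2}{L}\hat{\sigma}_0$. It uses a plain O(n^2) discrete Fourier transform on equispaced samples; all function names are hypothetical, and the discretization is an assumption of this example rather than a scheme proposed in the paper.

```python
import cmath
import math


def dft(values):
    """Discrete Fourier coefficients of equispaced samples (naive O(n^2) sum)."""
    n = len(values)
    return [sum(values[j] * cmath.exp(-2j * math.pi * m * j / n) for j in range(n)) / n
            for m in range(n)]


def wavenumber(m, n):
    """Map the DFT index m to the signed integer wavenumber k."""
    return m if m <= n // 2 else m - n


def apply_multiplier(values, mult):
    """Multiply the k-th Fourier coefficient by mult(k) and transform back."""
    n = len(values)
    coeffs = dft(values)
    return [sum(mult(wavenumber(m, n)) * coeffs[m] * cmath.exp(2j * math.pi * m * j / n)
                for m in range(n))
            for j in range(n)]


def k20_interior_series(sigma, L):
    """Series form of (73): 4 pi^2 / L at k = 0, (2 pi^2 / L)(|k| + 1) otherwise."""
    c = 2.0 * math.pi ** 2 / L
    return apply_multiplier(sigma, lambda k: 2.0 * c if k == 0 else c * (abs(k) + 1))


def k20_interior_closed(sigma, L):
    """Closed form of (73), with H(sigma') applied as the multiplier 2 pi |k| / L."""
    n = len(sigma)
    c = 2.0 * math.pi ** 2 / L
    h_dsigma = apply_multiplier(sigma, lambda k: 2.0 * math.pi * abs(k) / L)
    sigma0 = sum(sigma) / n  # zeroth Fourier coefficient
    return [c * sigma[j] + math.pi * h_dsigma[j] + c * sigma0 for j in range(n)]
```

The multiplier of $H(\sigma')$ follows from (130): differentiation contributes $2\pi i k / L$ and the Hilbert transform contributes $-i\,\mathrm{sgn}(k)$, giving $2\pi |k| / L$.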

Theorem 2.6 above is proved by direct evaluation of the relevant integrals (in Section 4
below, we compute these integrals via the theory of residues). Formulae (70) - (88) are an
immediate consequence of Theorem 2.6; they provide explicit expressions for the operators
(30) - (48) when $\gamma$ is a circle.

The following theorem follows easily from well-known results (see, for example, [19, 13]),
here stated in a slightly different form.


Theorem 2.7 Suppose that $\gamma : [0,L] \to \mathbb{R}^2$ is a $k$ times continuously differentiable Jordan
curve parametrized by its arclength, and that $\eta : [0,L] \to \mathbb{R}^2$ denotes the circle of radius $\frac{L}{2\pi}$.
Then, for any sufficiently smooth function $\sigma : [0, L] \to \mathbb{R}$,

(a)
$$K^{0}_{\gamma}(\sigma)(s) = K^{0}_{\eta}(\sigma)(s) + M_0(\sigma)(s), \tag{89}$$

(b)
$$K^{1,0}_{\gamma,i}(\sigma)(s) = K^{1,0}_{\eta,i}(\sigma)(s) + M_1(\sigma)(s) = -\pi \sigma(s) + N_1(\sigma)(s), \tag{90}$$

$$K^{1,0}_{\gamma,e}(\sigma)(s) = K^{1,0}_{\eta,e}(\sigma)(s) + M_1(\sigma)(s) = \pi \sigma(s) + N_1(\sigma)(s), \tag{91}$$

(c)
$$K^{0,1}_{\gamma,i}(\sigma)(s) = K^{0,1}_{\eta,i}(\sigma)(s) + M_1^{*}(\sigma)(s) = \pi \sigma(s) + N_1^{*}(\sigma)(s), \tag{92}$$

$$K^{0,1}_{\gamma,e}(\sigma)(s) = K^{0,1}_{\eta,e}(\sigma)(s) + M_1^{*}(\sigma)(s) = -\pi \sigma(s) + N_1^{*}(\sigma)(s), \tag{93}$$

where $M_0, M_1, N_1 : c[0, L] \to c[0,L]$ are integral operators with kernels $m_0(s,t) \in c^{k-1}([0,L] \times [0,L])$,
$m_1(s,t), n_1(s,t) \in c^{k-2}([0,L] \times [0,L])$, respectively. Furthermore, $M_1^{*}$, $N_1^{*}$ are the
adjoints of $M_1$, $N_1$, respectively, and the operator $M_0$ is self-adjoint.

Theorem 2.7 approximates the operators $K^{0}_{\gamma}$, $K^{1,0}_{\gamma,i}$, $K^{1,0}_{\gamma,e}$, $K^{0,1}_{\gamma,i}$, $K^{0,1}_{\gamma,e}$ for an arbitrary smooth
Jordan curve by the same operators on the circle; Theorem 2.8 below extends these results to
the operators (33), (34), (39), (40), (43), (44). While Theorem 2.7 is well-known, the authors
failed to find Theorem 2.8 in the literature.

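Theorem 2.8 Suppose that $\gamma : [0,L] \to \mathbb{R}^2$ is a $k$ times continuously differentiable Jordan
curve parametrized by its arclength, and that $\eta : [0,L] \to \mathbb{R}^2$ denotes the circle of radius $\frac{L}{2\pi}$,
also parametrized by its arclength. Then, for any sufficiently smooth function $\sigma : [0, L] \to \mathbb{R}$,

(a)
$$K^{2,0}_{\gamma,i}(\sigma)(s) = \left( \pi c(s) - \frac{2\pi^2}{L} \right) \sigma(s) + K^{2,0}_{\eta,i}(\sigma)(s) + M_2(\sigma)(s) = \pi c(s) \, \sigma(s) + \pi H(\sigma')(s) + N_2(\sigma)(s), \tag{94}$$

$$K^{2,0}_{\gamma,e}(\sigma)(s) = -\left( \pi c(s) - \frac{2\pi^2}{L} \right) \sigma(s) + K^{2,0}_{\eta,e}(\sigma)(s) + M_2(\sigma)(s) = -\pi c(s) \, \sigma(s) + \pi H(\sigma')(s) + N_2(\sigma)(s), \tag{95}$$

(b)
$$K^{1,1}_{\gamma,i}(\sigma)(s) = K^{1,1}_{\eta,i}(\sigma)(s) + G_2(\sigma)(s) = -\pi H(\sigma')(s) + G_2(\sigma)(s), \tag{96}$$

$$K^{1,1}_{\gamma,e}(\sigma)(s) = K^{1,1}_{\eta,e}(\sigma)(s) + G_2(\sigma)(s) = -\pi H(\sigma')(s) + G_2(\sigma)(s), \tag{97}$$

(c)
$$K^{0,2}_{\gamma,i}(\sigma)(s) = -\left( \pi c(s) - \frac{2\pi^2}{L} \right) \sigma(s) + K^{0,2}_{\eta,i}(\sigma)(s) + M_2^{*}(\sigma)(s) = -\pi c(s) \, \sigma(s) + \pi H(\sigma')(s) + N_2^{*}(\sigma)(s), \tag{98}$$

$$K^{0,2}_{\gamma,e}(\sigma)(s) = \left( \pi c(s) - \frac{2\pi^2}{L} \right) \sigma(s) + K^{0,2}_{\eta,e}(\sigma)(s) + M_2^{*}(\sigma)(s) = \pi c(s) \, \sigma(s) + \pi H(\sigma')(s) + N_2^{*}(\sigma)(s), \tag{99}$$

where $c(s)$ denotes the curvature of $\gamma$ at $\gamma(s)$, and $M_2, N_2, G_2 : c[0,L] \to c[0,L]$ are integral
operators with kernels $m_2(s,t)$, $n_2(s,t)$, $g_2(s,t) \in c^{k-2}([0,L] \times [0,L])$, respectively. Furthermore,
$M_2^{*}$, $N_2^{*}$ are the adjoints of $M_2$, $N_2$, the operator $G_2$ is self-adjoint, and $H$ denotes the Hilbert
transform (see (130) in Section 3.3).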


Remark 2.9 The formulae (90) - (93) above are somewhat misleading, in that they state very
simple facts in a relatively complicated manner. Specifically, each of the operators $K^{1,0}_{\gamma,i}$, $K^{1,0}_{\gamma,e}$,
$K^{0,1}_{\gamma,i}$, $K^{0,1}_{\gamma,e}$ is a second kind integral operator with smooth ($c^{k-2}$) kernel (see, for example, [13]).
In the case of the circle, the kernels of the operators $K^{1,0}_{\eta}$, $K^{0,1}_{\eta}$ are identically equal
to $-\frac{\pi}{L}$. Thus, (90) - (93) state the trivial fact that the difference of two smooth kernels is
smooth. We list (90) - (93) for compatibility with the formulae (89), (94) - (99).


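Observation 2.10 Formulae (89) - (99) have a straightforward interpretation. Specifically,
each of the operators $K^{0}_{\gamma}$, $K^{1,0}_{\gamma,i}$, $K^{1,0}_{\gamma,e}$, $K^{0,1}_{\gamma,i}$, $K^{0,1}_{\gamma,e}$, $K^{2,0}_{\gamma,i}$, $K^{2,0}_{\gamma,e}$, $K^{1,1}_{\gamma,i}$, $K^{1,1}_{\gamma,e}$, $K^{0,2}_{\gamma,i}$, $K^{0,2}_{\gamma,e}$ is a sum
of a standard operator (the corresponding operator on the circle) and an integral operator with
a smooth kernel.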


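In Section 4 below, a proof of formulae (94) and (95) is given; the proofs of the remaining formulae
(96) - (99) in Theorem 2.8 are similar and are omitted. Theorem 2.11 below extends the results
of Theorem 2.8 above to the operators $K^{3,0}_{\gamma,i}$, $K^{3,0}_{\gamma,e}$, $K^{2,1}_{\gamma,i}$, $K^{2,1}_{\gamma,e}$, $K^{1,2}_{\gamma,i}$, $K^{1,2}_{\gamma,e}$, $K^{0,3}_{\gamma,i}$, $K^{0,3}_{\gamma,e}$. Its proof
is virtually identical to that of Theorem 2.8, and is omitted.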


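Theorem 2.11 Suppose that $\gamma : [0, L] \to \mathbb{R}^2$ is a $k$ times continuously differentiable Jordan
curve parametrized by its arclength, and that $\eta : [0,L] \to \mathbb{R}^2$ denotes the circle of radius $\frac{L}{2\pi}$,
also parametrized by its arclength. Then, for any sufficiently smooth function $\sigma : [0, L] \to \mathbb{R}$,

(a)
$$\begin{aligned} K^{3,0}_{\gamma,i}(\sigma)(s) &= -\left( 2\pi (c(s))^2 - \frac{4\pi^2}{L} c(s) \right) \sigma(s) + \left( \pi - \frac{L}{2} c(s) \right) \sigma''(s) - 2\pi c'(s) H(\sigma)(s) + \frac{L}{2\pi} c(s) K^{3,0}_{\eta,i}(\sigma)(s) + M_3(\sigma)(s) \\ &= -2\pi (c(s))^2 \sigma(s) + \pi \sigma''(s) - 2\pi c'(s) H(\sigma)(s) - 3\pi c(s) H(\sigma')(s) + N_3(\sigma)(s), \end{aligned} \tag{100}$$

$$\begin{aligned} K^{3,0}_{\gamma,e}(\sigma)(s) &= \left( 2\pi (c(s))^2 - \frac{4\pi^2}{L} c(s) \right) \sigma(s) - \left( \pi - \frac{L}{2} c(s) \right) \sigma''(s) - 2\pi c'(s) H(\sigma)(s) + \frac{L}{2\pi} c(s) K^{3,0}_{\eta,e}(\sigma)(s) + M_3(\sigma)(s) \\ &= 2\pi (c(s))^2 \sigma(s) - \pi \sigma''(s) - 2\pi c'(s) H(\sigma)(s) - 3\pi c(s) H(\sigma')(s) + N_3(\sigma)(s), \end{aligned} \tag{101}$$

(b)
$$\begin{aligned} K^{2,1}_{\gamma,i}(\sigma)(s) &= -\left( \pi - \frac{L}{2} c(s) \right) \sigma''(s) + \pi c'(s) H(\sigma)(s) + \frac{L}{2\pi} c(s) K^{2,1}_{\eta,i}(\sigma)(s) + G_3(\sigma)(s) \\ &= -\pi \sigma''(s) + \pi c'(s) H(\sigma)(s) + \pi c(s) H(\sigma')(s) + G_3(\sigma)(s), \end{aligned} \tag{102}$$

$$\begin{aligned} K^{2,1}_{\gamma,e}(\sigma)(s) &= \left( \pi - \frac{L}{2} c(s) \right) \sigma''(s) + \pi c'(s) H(\sigma)(s) + \frac{L}{2\pi} c(s) K^{2,1}_{\eta,e}(\sigma)(s) + G_3(\sigma)(s) \\ &= \pi \sigma''(s) + \pi c'(s) H(\sigma)(s) + \pi c(s) H(\sigma')(s) + G_3(\sigma)(s), \end{aligned} \tag{103}$$

(c)
$$\begin{aligned} K^{1,2}_{\gamma,i}(\sigma)(s) &= \left( \pi - \frac{L}{2} c(s) \right) \sigma''(s) + \frac{L}{2\pi} c(s) K^{1,2}_{\eta,i}(\sigma)(s) + G_3^{*}(\sigma)(s) \\ &= \pi \sigma''(s) + \pi c(s) H(\sigma')(s) + G_3^{*}(\sigma)(s), \end{aligned} \tag{104}$$

$$\begin{aligned} K^{1,2}_{\gamma,e}(\sigma)(s) &= -\left( \pi - \frac{L}{2} c(s) \right) \sigma''(s) + \frac{L}{2\pi} c(s) K^{1,2}_{\eta,e}(\sigma)(s) + G_3^{*}(\sigma)(s) \\ &= -\pi \sigma''(s) + \pi c(s) H(\sigma')(s) + G_3^{*}(\sigma)(s), \end{aligned} \tag{105}$$

(d)
$$\begin{aligned} K^{0,3}_{\gamma,i}(\sigma)(s) &= \left( 2\pi (c(s))^2 - \frac{4\pi^2}{L} c(s) \right) \sigma(s) - \left( \pi - \frac{L}{2} c(s) \right) \sigma''(s) - \pi c'(s) H(\sigma)(s) + \frac{L}{2\pi} c(s) K^{0,3}_{\eta,i}(\sigma)(s) + M_3^{*}(\sigma)(s) \\ &= 2\pi (c(s))^2 \sigma(s) - \pi \sigma''(s) - \pi c'(s) H(\sigma)(s) - 3\pi c(s) H(\sigma')(s) + N_3^{*}(\sigma)(s), \end{aligned} \tag{106}$$

$$\begin{aligned} K^{0,3}_{\gamma,e}(\sigma)(s) &= -\left( 2\pi (c(s))^2 - \frac{4\pi^2}{L} c(s) \right) \sigma(s) + \left( \pi - \frac{L}{2} c(s) \right) \sigma''(s) - \pi c'(s) H(\sigma)(s) + \frac{L}{2\pi} c(s) K^{0,3}_{\eta,e}(\sigma)(s) + M_3^{*}(\sigma)(s) \\ &= -2\pi (c(s))^2 \sigma(s) + \pi \sigma''(s) - \pi c'(s) H(\sigma)(s) - 3\pi c(s) H(\sigma')(s) + N_3^{*}(\sigma)(s), \end{aligned} \tag{107}$$

where $c(s)$ denotes the curvature of $\gamma$ at $\gamma(s)$, and $M_3, N_3, G_3 : c[0,L] \to c[0,L]$ are integral
operators with smooth kernels $m_3(s,t)$, $n_3(s,t)$, $g_3(s,t)$ on $[0,L] \times [0,L]$, respectively. Furthermore,
$M_3^{*}$, $N_3^{*}$, $G_3^{*}$ are the adjoints of $M_3$, $N_3$, $G_3$, and $H$ denotes the Hilbert transform (see (130)
in Section 3.3).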

2.4  Computational  Observations 

In  the  numerical  solution  of  elliptic  PDEs,  one  is  often  confronted  with  the  task  of  evaluating 
some  (or  all)  of  the  operators  (30)  -  (48)  numerically.  While  this  class  of  issues  will  be  discussed 
in  detail  in  a  sequel  to  this  paper,  here  we  observe  that  an  inspection  of  the  formulae  (50)  -  (68), 
(89)  -  (93),  (94)  -  (99),  (100)  -  (107)  immediately  shows  that  each  of  the  operators  (30)  -  (48)  is 
a  combination  of  the  following:  integral  operators  with  smooth  kernels,  integral  operators  with 
the logarithmic singularity on the diagonal, the Hilbert transform, the derivative of the Hilbert
transform, and the second derivative. The techniques for the accurate integration of smooth
functions have been available for hundreds of years, and the numerical evaluation of the second
derivative presents no serious problems. Effective techniques for the numerical evaluation of
the Hilbert transform are less well-known, but have also been available for many years (see,
for example, [16]). Efficient integration of logarithmically singular functions is also not very
difficult (see [15, 8, 2]). The only possible source of problems is the derivative of the Hilbert
transform; quadrature rules for the evaluation of the latter have been constructed, and will be
published in [10]. Thus, there exist rapidly convergent schemes for the numerical evaluation
of all of the operators (30) - (48), and, therefore, for the discretization of any problem of
mathematical physics that has been reduced to a set of integro-pseudodifferential equations
involving any (or all) of the operators (30) - (48).

Of course, when a problem of mathematical physics is discretized, one of the principal issues
is the condition number of the obtained system of equations. An examination of the formulae
(51), (57), (52), (58) immediately shows that the operators $K^{1,0}_{\gamma,i}$, $K^{0,1}_{\gamma,i}$, $K^{1,0}_{\gamma,e}$, $K^{0,1}_{\gamma,e}$ are asymptotically
well-conditioned (being a sum of the identity and a compact operator). The spectrum
of the operator $K^{0}_{\gamma}$ decays as $1/k$ with $k$ the sequence number of the eigenvalue (see (50)), and
its $n$-point discretization will (asymptotically) have condition number $\sim n$. Each of the operators
$K^{2,0}_{\gamma,i}$, $K^{1,1}_{\gamma,i}$, $K^{0,2}_{\gamma,i}$, $K^{2,0}_{\gamma,e}$, $K^{1,1}_{\gamma,e}$, $K^{0,2}_{\gamma,e}$ has a spectrum that grows linearly, and the $n$-point
discretization of each of them will have condition number $\sim n$. Finally, each of the operators
$K^{3,0}_{\gamma,i}$, $K^{2,1}_{\gamma,i}$, $K^{1,2}_{\gamma,i}$, $K^{0,3}_{\gamma,i}$, $K^{3,0}_{\gamma,e}$, $K^{2,1}_{\gamma,e}$, $K^{1,2}_{\gamma,e}$, $K^{0,3}_{\gamma,e}$ has a spectrum that grows as $k^2$; an $n$-point
discretization of any of them will have condition number $\sim n^2$. Thus, whenever the problem to be
solved results in the discretization of any one of the operators $K^{0}_{\gamma}$, $K^{2,0}_{\gamma,i}$, $K^{1,1}_{\gamma,i}$, $K^{0,2}_{\gamma,i}$, $K^{2,0}_{\gamma,e}$, $K^{1,1}_{\gamma,e}$,
$K^{0,2}_{\gamma,e}$, $K^{3,0}_{\gamma,i}$, $K^{2,1}_{\gamma,i}$, $K^{1,2}_{\gamma,i}$, $K^{0,3}_{\gamma,i}$, $K^{3,0}_{\gamma,e}$, $K^{2,1}_{\gamma,e}$, $K^{1,2}_{\gamma,e}$, $K^{0,3}_{\gamma,e}$, there is a potential for condition number
problems, similar to those encountered with direct discretization of differential equations.
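The growth rates quoted above can be read off directly from the circle formulae of Theorem 2.6. The following sketch takes the eigenvalues for the modes $1 \le |k| \le n/2$ as a stand-in for an idealized $n$-point Fourier discretization on the circle of radius $r$ and reports the ratio of the largest to the smallest modulus. The restriction to nonzero modes, the choice of discretization, and the names `SYMBOLS` and `symbol_condition` are assumptions of this example.

```python
import math

R = 1.0  # radius of the circle (illustrative value)

# Eigenvalues on the mode exp(i k t / r), k != 0, taken from Theorem 2.6.
SYMBOLS = {
    "K0": lambda k: math.pi * R / abs(k),                                 # (50)
    "K10_i": lambda k: -math.pi,                                          # (51)
    "K20_i": lambda k: math.pi * (abs(k) + 1) / R,                        # (53)
    "K30_i": lambda k: -math.pi * (abs(k) + 1) * (abs(k) + 2) / R ** 2,   # (55)
}


def symbol_condition(name, n):
    """Ratio max|lambda_k| / min|lambda_k| over the modes 1 <= k <= n/2."""
    values = [abs(SYMBOLS[name](k)) for k in range(1, n // 2 + 1)]
    return max(values) / min(values)
```

Doubling n leaves the ratio for `K10_i` at 1, roughly doubles it for `K0` and `K20_i`, and roughly quadruples it for `K30_i`, in line with the estimates $\sim n$ and $\sim n^2$ above.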

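Fortunately, formulae (50) - (68) suggest a solution. Specifically, an examination of the
formulae (50), (53), (89), (94) immediately indicates that each of the operators $K^{0}_{\gamma} \circ K^{2,0}_{\gamma,i}$,
$K^{2,0}_{\gamma,i} \circ K^{0}_{\gamma}$ is a sum of multiplication by a constant with a compact operator, i.e.

$$K^{0}_{\gamma} \circ K^{2,0}_{\gamma,i} = \pi^2 \cdot I + M_i^{00,20}, \tag{108}$$

$$K^{2,0}_{\gamma,i} \circ K^{0}_{\gamma} = \pi^2 \cdot I + M_i^{20,00}, \tag{109}$$

with $M_i^{00,20}$, $M_i^{20,00}$ compact operators $L^2[0,L] \to L^2[0,L]$. Similarly,

$$K^{0}_{\gamma} \circ K^{2,0}_{\gamma,e} = \pi^2 \cdot I + M_e^{00,20}, \tag{110}$$

$$K^{2,0}_{\gamma,e} \circ K^{0}_{\gamma} = \pi^2 \cdot I + M_e^{20,00}, \tag{111}$$

$$K^{0}_{\gamma} \circ K^{1,1}_{\gamma,i} = -\pi^2 \cdot I + M_i^{00,11}, \tag{112}$$

$$K^{1,1}_{\gamma,i} \circ K^{0}_{\gamma} = -\pi^2 \cdot I + M_i^{11,00}, \tag{113}$$

$$K^{0}_{\gamma} \circ K^{1,1}_{\gamma,e} = -\pi^2 \cdot I + M_e^{00,11}, \tag{114}$$

$$K^{1,1}_{\gamma,e} \circ K^{0}_{\gamma} = -\pi^2 \cdot I + M_e^{11,00}, \tag{115}$$

$$K^{0}_{\gamma} \circ K^{0,2}_{\gamma,i} = \pi^2 \cdot I + M_i^{00,02}, \tag{116}$$

$$K^{0,2}_{\gamma,i} \circ K^{0}_{\gamma} = \pi^2 \cdot I + M_i^{02,00}, \tag{117}$$

$$K^{0}_{\gamma} \circ K^{0,2}_{\gamma,e} = \pi^2 \cdot I + M_e^{00,02}, \tag{118}$$

$$K^{0,2}_{\gamma,e} \circ K^{0}_{\gamma} = \pi^2 \cdot I + M_e^{02,00}, \tag{119}$$

and all of the operators $M_i^{11,00}$, $M_e^{11,00}$, $M_i^{00,11}$, $M_e^{00,11}$, $M_i^{02,00}$, $M_e^{02,00}$, $M_i^{00,02}$, $M_e^{00,02}$ are compact.
In other words, the operator $K^{0}_{\gamma}$ is a perfect preconditioner (asymptotically speaking) for each
of the second order pseudodifferential operators of potential theory in two dimensions; in turn,
$K^{0}_{\gamma}$ is preconditioned by each of the operators (94) - (99).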
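On the circle, the effect of this preconditioning can be checked mode by mode: multiplying the eigenvalue of $K^{0}_{\gamma}$ from (50) by the eigenvalue of a second operator from Theorem 2.6 gives a sequence that tends to $\pm\pi^2$ as $|k|$ grows, and the remainder decays like $1/|k|$, which is the behaviour expected of a compact perturbation. The sketch below is a worked check under that circle-only assumption; the dictionary and function names are hypothetical.

```python
import math

# Eigenvalues on exp(i k t / r), k != 0, taken from (53), (59) and (63).
SECOND_OPERATOR = {
    "K20_i": lambda k, r: math.pi * (abs(k) + 1) / r,
    "K11_i": lambda k, r: -math.pi * abs(k) / r,
    "K02_i": lambda k, r: math.pi * (abs(k) - 1) / r,
}


def preconditioned_symbol(name, k, r=1.0):
    """Eigenvalue of K0 composed with the named operator on the mode k != 0."""
    k0 = math.pi * r / abs(k)  # eigenvalue of K0 from (50)
    return k0 * SECOND_OPERATOR[name](k, r)
```

For example, `preconditioned_symbol("K20_i", k)` equals $\pi^2 (1 + 1/|k|)$, so the spectrum of the composition clusters at $\pi^2$ regardless of the radius.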

Expressions  (100)  -  (107)  contain  the  second  derivative,  and  are,  clearly,  preconditioned 
by the operator of repeated integration $I^2 : L^2[0,L] \to L^2[0,L]$, defined by its action on the
functions  el'm'xlL  via  the  formula 

=  (120) 

In  other  words,  for  each  of  the  operators  (30)  -  (48),  there  is  available  a  straightforward 
preconditioner.  Numerical  implications  of  these  (and  related)  observations  will  be  discussed  in 

3  Analytical  Preliminaries 

3.1  Principal  Value  Integrals 

Integrals of the form

$$\int_a^b \frac{\varphi(t)}{t - s} \, dt, \tag{121}$$

where $s \in (a, b)$, do not exist in the classical sense, and are often referred to as singular integrals.

Definition 3.1 Suppose that $\varphi$ is a function $[a, b] \to \mathbb{R}$, $s \in (a, b)$, and the limit

$$\lim_{\varepsilon \to 0} \left( \int_a^{s - \varepsilon} \frac{\varphi(t)}{t - s} \, dt + \int_{s + \varepsilon}^{b} \frac{\varphi(t)}{t - s} \, dt \right) \tag{122}$$

exists and is finite. Then we will denote the limit (122) by

$$\mathrm{p.v.} \int_a^b \frac{\varphi(t)}{t - s} \, dt, \tag{123}$$

and refer to it as a principal value integral.

Theorem 3.1 Suppose that the function $\varphi : [a, b] \to \mathbb{R}$ is continuously differentiable in a
neighborhood of $s \in (a, b)$. Then the principal value integral (123) exists.
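A small numerical illustration of the principal value may help here. On an interval that is symmetric about s, the two one-sided integrals in the limit can be folded into a single integral of $(\varphi(s+u) - \varphi(s-u))/u$ over $(0, w)$, whose integrand stays bounded when $\varphi$ is continuously differentiable. The sketch below evaluates that folded integral with the midpoint rule; the restriction to a symmetric interval and the function name are assumptions of this example, not a quadrature proposed in the paper.

```python
import math


def pv_integral_symmetric(phi, s, half_width, n=2000):
    """Midpoint estimate of p.v. int_{s-w}^{s+w} phi(t) / (t - s) dt.

    The substitution t = s +/- u folds both sides of the singularity into
    int_0^w (phi(s + u) - phi(s - u)) / u du, which has a bounded integrand.
    """
    du = half_width / n
    total = 0.0
    for j in range(n):
        u = (j + 0.5) * du  # midpoints never hit the singular point u = 0
        total += (phi(s + u) - phi(s - u)) / u
    return total * du
```

For $\varphi(t) = e^t$, $s = 0$ and $w = 1$ the folded integrand is $2\sinh(u)/u$, so the result can be compared with the series $2 \sum_{m \ge 0} \frac{1}{(2m+1)(2m+1)!}$.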
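3.2 Finite Part Integrals

In this paper, we will be dealing with integrals of the form

$$\int_a^b \frac{\varphi(t)}{(t - s)^2} \, dt, \tag{124}$$

where $s \in (a, b)$, which are divergent in the classical sense. Integrals of this type are often
referred to as hypersingular or strongly singular.

Definition 3.2 Suppose that $\varphi$ is a function $[a, b] \to \mathbb{R}$, $s \in (a, b)$, and the limit

$$\lim_{\varepsilon \to 0} \left( \int_a^{s - \varepsilon} \frac{\varphi(t)}{(t - s)^2} \, dt + \int_{s + \varepsilon}^{b} \frac{\varphi(t)}{(t - s)^2} \, dt - \frac{2 \varphi(s)}{\varepsilon} \right) \tag{125}$$

exists and is finite. Then we will denote the limit (125) by

$$\mathrm{f.p.} \int_a^b \frac{\varphi(t)}{(t - s)^2} \, dt, \tag{126}$$

and refer to it as a finite part integral (see, for example, [7]).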


The  following  obvious  theorem  provides  sufficient  conditions  for  the  existence  of  the  fi¬ 
nite  part  integral  (125),  and  establishes  a  connection  between  finite  part  and  principal  value 
integrals. 


Theorem 3.2 Suppose that the function $\varphi : [a, b] \to \mathbb{R}$ is twice continuously differentiable in
a neighborhood of $s \in (a, b)$. Then the finite part integral (126) exists, and


(127) 


3.3  The  Hilbert  Transform 

For an arbitrary periodic function $\varphi \in L^2[-\pi, \pi]$ and any integer $k$, we will denote by $\hat{\varphi}_k$ the
$k$-th Fourier coefficient of $\varphi$, defined by the formula

$$\hat{\varphi}_k = \frac{1}{2\pi} \int_{-\pi}^{\pi} \varphi(t) \, e^{-ikt} \, dt, \tag{128}$$

so that

$$\varphi(t) = \sum_{k=-\infty}^{\infty} \hat{\varphi}_k \, e^{ikt}, \tag{129}$$

for all $t \in [-\pi, \pi]$.

Definition 3.3 The Hilbert transform is the mapping $H : L^2[-\pi, \pi] \to L^2[-\pi, \pi]$, given by the
formula

$$H(\varphi)(s) = \sum_{k \neq 0} -i \, \mathrm{sgn}(k) \, \hat{\varphi}_k \, e^{iks}, \tag{130}$$

with $\varphi \in L^2[-\pi, \pi]$ an arbitrary function. The function $H(\varphi) : [-\pi, \pi] \to \mathbb{C}$ is often referred
to as the conjugate function of $\varphi$.

The following theorem summarizes several well-known properties of the Hilbert transform
(see, for example, [9]).

Theorem 3.3 (a) The mapping $H : L^2[-\pi, \pi] \to L^2[-\pi, \pi]$ is bounded.

(b) For any integrable $\varphi$, the identity

$$H(\varphi)(s) = \frac{1}{2\pi} \, \mathrm{p.v.} \int_{-\pi}^{\pi} \varphi(t) \cot\!\left( \frac{s - t}{2} \right) dt \tag{131}$$

holds almost everywhere.

(c) For any function $\varphi \in c^1[-\pi, \pi]$,

$$H(\varphi')(s) = \left( H(\varphi) \right)'(s) = \sum_{k \neq 0} |k| \, \hat{\varphi}_k \, e^{iks}. \tag{132}$$

In other words,

$$H D = D H, \tag{133}$$

where $D = \frac{d}{ds}$ is the differentiation operator.
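Since (130) defines the Hilbert transform as a Fourier multiplier, it can be applied to equispaced samples of a periodic function by scaling discrete Fourier coefficients by $-i\,\mathrm{sgn}(k)$. The sketch below does this with a naive O(n^2) transform so that it needs only the standard library; it is an illustration under the assumption of a band-limited input sampled at an odd number of points (which avoids an ambiguous Nyquist mode), and the function name is hypothetical.

```python
import cmath
import math


def hilbert_periodic(samples):
    """Apply the multiplier -i sgn(k) of (130) to equispaced samples."""
    n = len(samples)
    coeffs = [sum(samples[j] * cmath.exp(-2j * math.pi * m * j / n) for j in range(n)) / n
              for m in range(n)]
    result = []
    for j in range(n):
        acc = 0.0 + 0.0j
        for m in range(n):
            k = m if m < n / 2 else m - n   # signed wavenumber of DFT index m
            sgn = (k > 0) - (k < 0)         # sgn(0) = 0 drops the k = 0 term
            acc += -1j * sgn * coeffs[m] * cmath.exp(2j * math.pi * m * j / n)
        result.append(acc)
    return result
```

With this definition $H(\cos(3t)) = \sin(3t)$, and applying the transform to the derivative $-3\sin(3t)$ gives $3\cos(3t)$, which agrees with the sum $\sum_{k \neq 0} |k| \hat{\varphi}_k e^{iks}$ in (132).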

3.4  Boundary  Integral  Operators 

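In this subsection, we define the boundary integral operators $K^{1,0}_{\gamma}$, $K^{2,0}_{\gamma}$, $K^{3,0}_{\gamma}$, $K^{0,1}_{\gamma}$, $K^{1,1}_{\gamma}$, $K^{2,1}_{\gamma}$,
$K^{0,2}_{\gamma}$, $K^{1,2}_{\gamma}$, $K^{0,3}_{\gamma}$ that are closely related to the operators (31) - (48) defined in Section 2.

Definition 3.4 Suppose that the function $\sigma : [0,L] \to \mathbb{R}$ is sufficiently smooth. Then we
denote by $K^{1,0}_{\gamma}, K^{0,1}_{\gamma} : c[0,L] \to c[0,L]$ and $K^{2,0}_{\gamma}$, $K^{3,0}_{\gamma}$, $K^{1,1}_{\gamma}$, $K^{2,1}_{\gamma}$, $K^{0,2}_{\gamma}$, $K^{1,2}_{\gamma}$, $K^{0,3}_{\gamma} : c^2[0,L] \to
c[0, L]$ the operators defined by the formulae

$$K^{1,0}_{\gamma}(\sigma)(s) = \int_0^L \frac{\partial \Phi_{\gamma(t)}(\gamma(s))}{\partial N(t)} \, \sigma(t) \, dt, \tag{134}$$

$$K^{2,0}_{\gamma}(\sigma)(s) = \mathrm{f.p.} \int_0^L \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s))}{\partial N(t)^2} \, \sigma(t) \, dt, \tag{135}$$

$$K^{3,0}_{\gamma}(\sigma)(s) = \mathrm{f.p.} \int_0^L \frac{\partial^3 \Phi_{\gamma(t)}(\gamma(s))}{\partial N(t)^3} \, \sigma(t) \, dt, \tag{136}$$

$$K^{0,1}_{\gamma}(\sigma)(s) = \int_0^L \frac{\partial \Phi_{\gamma(t)}(\gamma(s))}{\partial N(s)} \, \sigma(t) \, dt, \tag{137}$$

$$K^{1,1}_{\gamma}(\sigma)(s) = \mathrm{f.p.} \int_0^L \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s))}{\partial N(s) \, \partial N(t)} \, \sigma(t) \, dt, \tag{138}$$

$$K^{2,1}_{\gamma}(\sigma)(s) = \mathrm{f.p.} \int_0^L \frac{\partial^3 \Phi_{\gamma(t)}(\gamma(s))}{\partial N(s) \, \partial N(t)^2} \, \sigma(t) \, dt, \tag{139}$$

$$K^{0,2}_{\gamma}(\sigma)(s) = \mathrm{f.p.} \int_0^L \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s))}{\partial N(s)^2} \, \sigma(t) \, dt, \tag{140}$$

$$K^{1,2}_{\gamma}(\sigma)(s) = \mathrm{f.p.} \int_0^L \frac{\partial^3 \Phi_{\gamma(t)}(\gamma(s))}{\partial N(s)^2 \, \partial N(t)} \, \sigma(t) \, dt, \tag{141}$$

$$K^{0,3}_{\gamma}(\sigma)(s) = \mathrm{f.p.} \int_0^L \frac{\partial^3 \Phi_{\gamma(t)}(\gamma(s))}{\partial N(s)^3} \, \sigma(t) \, dt, \tag{142}$$

respectively.

Remark 3.4 Obviously, the operators $K^{0,1}_{\gamma}$, $K^{0,2}_{\gamma}$, $K^{0,3}_{\gamma}$, $K^{1,2}_{\gamma}$ given by the formulae (137),
(140) - (142) are the adjoints of the operators $K^{1,0}_{\gamma}$, $K^{2,0}_{\gamma}$, $K^{3,0}_{\gamma}$, $K^{2,1}_{\gamma}$ defined by (134) - (136),
(139). Furthermore, $K^{1,1}_{\gamma}$, defined by (138), is self-adjoint.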

4  Proof  of  Results 

In this section we prove the results from Section 2. The outline of this section is as follows:
First, we consider the case where $\gamma$ is a circle. We provide the proof for Theorem 2.6. In
Lemma 4.2 we give explicit formulas for the boundary integral operators (134) - (140) for the
case where $\gamma$ is a circle. Then, by combining Theorem 2.6 and Lemma 4.2, we immediately
get the so-called jump conditions for the operators (12) - (25) on a circle. These are stated in
Theorem 4.3.

Next, we consider the case where $\gamma$ is an arbitrary and sufficiently smooth Jordan curve.
Since the proofs of the identities (94) - (99) in Theorem 2.8 are similar, we only provide the
proof for (94) and (95). In fact, (94) and (95) in Theorem 2.8 follow immediately from Theorem
4.7 and Lemma 4.6. The proof of Theorem 4.7 is based on Theorem 4.3 and the approximation
(178) given in Lemma 4.5.


Proof of Theorem 2.6 Since the proofs for the identities (50) - (64) are nearly identical, we
only provide the proof for the interior limit of the quadruple layer potential (53). Further, it
is enough to prove (53) for the case $r = 1$; the general case follows by a simple transformation
of variables. We choose the parametrization

$$\gamma(t) = (\cos(t), \sin(t)), \tag{143}$$

where $t \in [-\pi, \pi]$. It immediately follows from (143) that

$$\begin{aligned} \int_{-\pi}^{\pi} \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(t)^2} \, e^{ikt} \, dt &= \int_{-\pi}^{\pi} \frac{1 - 2 \cdot (1 - h) \cdot \cos(t - s) + (1 - h)^2 \cdot \cos(2 (t - s))}{(1 + (1 - h)^2 - 2 \cdot (1 - h) \cdot \cos(t - s))^2} \, e^{ikt} \, dt \\ &= e^{iks} \cdot \int_{-\pi}^{\pi} \frac{1 - 2 \cdot (1 - h) \cdot \cos(t) + (1 - h)^2 \cdot \cos(2t)}{(1 + (1 - h)^2 - 2 \cdot (1 - h) \cdot \cos(t))^2} \, e^{ikt} \, dt, \end{aligned} \tag{144}$$

for any $s \in [-\pi, \pi]$. We will use calculus of residues to evaluate the integral (144). To this
effect, the substitution

$$z = e^{it} \tag{145}$$

converts (144) into

$$e^{iks} \cdot \int_{-\pi}^{\pi} \frac{1 - 2 \cdot (1 - h) \cdot \cos(t) + (1 - h)^2 \cdot \cos(2t)}{(1 + (1 - h)^2 - 2 \cdot (1 - h) \cdot \cos(t))^2} \, e^{ikt} \, dt = e^{iks} \cdot \oint_{|z| = 1} \frac{-i}{z} \cdot \frac{1 - (1 - h)(z + z^{-1}) + \frac{1}{2} (1 - h)^2 (z^2 + z^{-2})}{(1 + (1 - h)^2 - (1 - h)(z + z^{-1}))^2} \cdot z^k \, dz, \tag{146}$$

and after simple algebraic manipulation, we get

$$\frac{-i}{z} \cdot \frac{1 - (1 - h)(z + z^{-1}) + \frac{1}{2} (1 - h)^2 (z^2 + z^{-2})}{(1 + (1 - h)^2 - (1 - h)(z + z^{-1}))^2} \cdot z^k = -\frac{i}{2} \left( \frac{z^{k+1}}{(z - (1 - h))^2} + \frac{z^{k-1}}{(1 - (1 - h) \cdot z)^2} \right). \tag{147}$$

Substituting (147) into (146), we obtain

$$\int_{-\pi}^{\pi} \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(t)^2} \, e^{ikt} \, dt = e^{iks} \cdot \oint_{|z| = 1} -\frac{i}{2} \left( \frac{z^{k+1}}{(z - (1 - h))^2} + \frac{z^{k-1}}{(1 - (1 - h) \cdot z)^2} \right) dz. \tag{148}$$

Now, formula (53) for $r = 1$ follows by applying a standard residue calculation to (148). □
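The residue argument can be cross-checked numerically. For $k \ge 1$ the only pole of (148) inside the unit circle is the double pole at $z = 1 - h$, whose residue gives the value $\pi (k + 1)(1 - h)^k$ at $s = 0$; for $k = 0$ the simple pole at the origin contributes as well and the value is $2\pi$. The sketch below integrates the explicit integrand of (144) with the trapezoidal rule and can be compared against these values. The function name and the parameters `h` and `n` are illustrative assumptions.

```python
import cmath
import math


def quadruple_layer_interior(k, h, n=4000):
    """Trapezoidal estimate of the integral in (144) for r = 1 and s = 0."""
    a = 1.0 - h
    total = 0.0 + 0.0j
    for j in range(n):
        t = -math.pi + 2.0 * math.pi * j / n
        numerator = 1.0 - 2.0 * a * math.cos(t) + a * a * math.cos(2.0 * t)
        denominator = (1.0 + a * a - 2.0 * a * math.cos(t)) ** 2
        total += (numerator / denominator) * cmath.exp(1j * k * t)
    return total * (2.0 * math.pi / n)
```

As h tends to zero the value for $k \neq 0$ approaches $\pi (|k| + 1)$, which is (53) with $r = 1$. The number of nodes should be large compared with $1/h$, since the integrand has a peak of width of order h at $t = 0$.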

Remark  4.1  Formulae  (50)  -  (52),  (57)  -  (58)  follow  from  well-known  results  (see  for  example 
[11,  3]).  While  the  derivation  of  (53)  -  (56),  (59)  -  (64)  is  quite  similar,  the  authors  failed  to 
find  them  in  the  literature. 

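The operators $K^{1,0}_{\gamma}$, $K^{2,0}_{\gamma}$, $K^{3,0}_{\gamma}$, $K^{0,1}_{\gamma}$, $K^{1,1}_{\gamma}$, $K^{2,1}_{\gamma}$, $K^{0,2}_{\gamma}$, $K^{1,2}_{\gamma}$, $K^{0,3}_{\gamma}$ defined by (134) - (142),
assume a particularly simple form on the circle. The following lemma follows immediately from
an elementary computation.

Lemma 4.2 Suppose that $\gamma$ is a circle of radius $r$ parametrized by its arclength with exterior
unit normal denoted by $N$. Then, for any sufficiently smooth function $\sigma : [-\pi r, \pi r] \to \mathbb{C}$:

(a)
$$K^{1,0}_{\gamma}(\sigma)(s) = \int_{-\pi r}^{\pi r} -\frac{1}{2r} \, \sigma(t) \, dt = -\pi \hat{\sigma}_0, \tag{149}$$

(b)
$$K^{2,0}_{\gamma}(\sigma)(s) = \mathrm{f.p.} \int_{-\pi r}^{\pi r} \left( \frac{1}{2r^2} + \frac{1}{2r^2 \cos(\frac{t - s}{r}) - 2r^2} \right) \sigma(t) \, dt = \pi r^{-1} \hat{\sigma}_0 + \pi H(\sigma')(s), \tag{150}$$

(c)
$$K^{3,0}_{\gamma}(\sigma)(s) = \mathrm{f.p.} \int_{-\pi r}^{\pi r} \left( -\frac{1}{r^3} - \frac{3}{2r^3 \cos(\frac{t - s}{r}) - 2r^3} \right) \sigma(t) \, dt = -2\pi r^{-2} \hat{\sigma}_0 - 3\pi r^{-1} H(\sigma')(s), \tag{151}$$

(d)
$$K^{0,1}_{\gamma}(\sigma)(s) = \int_{-\pi r}^{\pi r} -\frac{1}{2r} \, \sigma(t) \, dt = -\pi \hat{\sigma}_0, \tag{152}$$

(e)
$$K^{1,1}_{\gamma}(\sigma)(s) = \mathrm{f.p.} \int_{-\pi r}^{\pi r} \frac{\sigma(t)}{2r^2 - 2r^2 \cos(\frac{t - s}{r})} \, dt = -\pi H(\sigma')(s), \tag{153}$$

(f)
$$K^{2,1}_{\gamma}(\sigma)(s) = \mathrm{f.p.} \int_{-\pi r}^{\pi r} \frac{\sigma(t)}{2r^3 \cos(\frac{t - s}{r}) - 2r^3} \, dt = \pi r^{-1} H(\sigma')(s), \tag{154}$$

(g)
$$K^{0,2}_{\gamma}(\sigma)(s) = \mathrm{f.p.} \int_{-\pi r}^{\pi r} \left( \frac{1}{2r^2} + \frac{1}{2r^2 \cos(\frac{t - s}{r}) - 2r^2} \right) \sigma(t) \, dt = \pi r^{-1} \hat{\sigma}_0 + \pi H(\sigma')(s), \tag{155}$$

(h)
$$K^{1,2}_{\gamma}(\sigma)(s) = \mathrm{f.p.} \int_{-\pi r}^{\pi r} \frac{\sigma(t)}{2r^3 \cos(\frac{t - s}{r}) - 2r^3} \, dt = \pi r^{-1} H(\sigma')(s), \tag{156}$$

(i)
$$K^{0,3}_{\gamma}(\sigma)(s) = \mathrm{f.p.} \int_{-\pi r}^{\pi r} \left( -\frac{1}{r^3} - \frac{3}{2r^3 \cos(\frac{t - s}{r}) - 2r^3} \right) \sigma(t) \, dt = -2\pi r^{-2} \hat{\sigma}_0 - 3\pi r^{-1} H(\sigma')(s), \tag{157}$$

where $H$ denotes the Hilbert transform (see (130) in Section 3.3).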

The following theorem is an immediate consequence of Theorem 2.6 and Lemma 4.2. It
summarizes the so-called jump conditions for the integrals (12) - (29) on the boundary $\Gamma$,
where $\Gamma$ is a circle.


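Theorem 4.3 Suppose that $\gamma$ is a circle of radius $r$ parametrized by its arclength with exterior
unit normal denoted by $N$. Further, suppose that $H$ denotes the Hilbert transform (130). Then,
for any sufficiently smooth function $\sigma : [-\pi r, \pi r] \to \mathbb{C}$,

(a) $K^{1,0}_{\gamma,i}(\sigma)(s) = -\pi\sigma(s) + K^{1,0}(\sigma)(s)$, (158)

$K^{1,0}_{\gamma,e}(\sigma)(s) = \pi\sigma(s) + K^{1,0}(\sigma)(s)$, (159)

(b) $K^{2,0}_{\gamma,i}(\sigma)(s) = \pi r^{-1}\sigma(s) + K^{2,0}(\sigma)(s)$, (160)

$K^{2,0}_{\gamma,e}(\sigma)(s) = -\pi r^{-1}\sigma(s) + K^{2,0}(\sigma)(s)$, (161)

(c) $K^{3,0}_{\gamma,i}(\sigma)(s) = -2\pi r^{-2}\sigma(s) + \pi\sigma''(s) + K^{3,0}(\sigma)(s)$, (162)

$K^{3,0}_{\gamma,e}(\sigma)(s) = 2\pi r^{-2}\sigma(s) - \pi\sigma''(s) + K^{3,0}(\sigma)(s)$, (163)

(d) $K^{0,1}_{\gamma,i}(\sigma)(s) = \pi\sigma(s) + (K^{1,0})^*(\sigma)(s)$, (164)

$K^{0,1}_{\gamma,e}(\sigma)(s) = -\pi\sigma(s) + (K^{1,0})^*(\sigma)(s)$, (165)

(e) $K^{1,1}_{\gamma,i}(\sigma)(s) = K^{1,1}_{\gamma,e}(\sigma)(s) = K^{1,1}(\sigma)(s) = -\pi H(\sigma')(s)$, (166)

(f) $K^{2,1}_{\gamma,i}(\sigma)(s) = -\pi\sigma''(s) + K^{2,1}(\sigma)(s)$, (167)

$K^{2,1}_{\gamma,e}(\sigma)(s) = \pi\sigma''(s) + K^{2,1}(\sigma)(s)$, (168)

(g) $K^{0,2}_{\gamma,i}(\sigma)(s) = -\pi r^{-1}\sigma(s) + K^{0,2}(\sigma)(s)$, (169)

$K^{0,2}_{\gamma,e}(\sigma)(s) = \pi r^{-1}\sigma(s) + K^{0,2}(\sigma)(s)$, (170)

(h) $K^{1,2}_{\gamma,i}(\sigma)(s) = \pi\sigma''(s) + (K^{2,1})^*(\sigma)(s)$, (171)

$K^{1,2}_{\gamma,e}(\sigma)(s) = -\pi\sigma''(s) + (K^{2,1})^*(\sigma)(s)$, (172)

(i) $K^{0,3}_{\gamma,i}(\sigma)(s) = 2\pi r^{-2}\sigma(s) - \pi\sigma''(s) + (K^{3,0})^*(\sigma)(s)$, (173)

$K^{0,3}_{\gamma,e}(\sigma)(s) = -2\pi r^{-2}\sigma(s) + \pi\sigma''(s) + (K^{3,0})^*(\sigma)(s)$. (174)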


We now proceed to the case where $\gamma$ is an arbitrary sufficiently smooth Jordan curve. The
following obvious lemma can be found in most elementary textbooks on differential geometry
(see, for example, [4]).

Lemma 4.4 Suppose that $\gamma : [0, L] \to \mathbb{R}^2$ is a sufficiently smooth Jordan curve parametrized
by its arclength with the exterior unit normal and the unit tangent vectors at $\gamma(s)$ denoted by
$N(s)$ and $T(s)$, respectively. Then, there exist a positive real number $a$ (dependent on $\gamma$), and
two continuously differentiable functions $f, g : (-a, a) \to \mathbb{R}$ (dependent on $\gamma$), such that for
any $s \in [0, L]$,

$\gamma(s + t) - \gamma(s) = \left(t + t^3 f(t)\right) \cdot T(s) - \left(\frac{c\,t^2}{2} + t^3 g(t)\right) \cdot N(s)$, (175)

for all $t \in (-a, a)$, where the coefficient $c$ in (175) is the curvature of $\gamma$ at the point $\gamma(s)$.
Furthermore, for all $t \in (-a, a)$,

$|f(t)| \le \|\gamma'''(s)\|$, (176)

$|g(t)| \le \|\gamma'''(s)\|$. (177)


In the local parametrization (175), the potential of a quadrupole located at $\gamma(s)$ and oriented
in the direction $N(s)$ assumes a particularly simple form, given by the following lemma.

Lemma 4.5 Suppose that $\gamma : [0, L] \to \mathbb{R}^2$ is a sufficiently smooth Jordan curve parametrized
by its arclength. Then, there exist real positive numbers $A$, $a$ and $h_0$ such that for any $s \in [0, L]$,

$\left| \frac{\partial^2 \Phi_{\gamma(s+t)}(\gamma(s) - h \cdot N(s))}{\partial N(s+t)^2} - \frac{h^2 - t^2}{(h^2 + t^2)^2} - \frac{c\,h\,t^2\,(5h^2 + t^2)}{(h^2 + t^2)^3} \right| \le A$, (178)

for all $t \in (-a, a)$, $0 < h < h_0$, where the coefficient $c$ in (178) is the curvature of $\gamma$ at the
point $\gamma(s)$.

Proof. Without loss of generality, it is sufficient to prove the lemma for the case where
$s = 0$, $\gamma(0) = 0$, and $\gamma'(0) = (1, 0)$. Substituting (175) into (9) and evaluating the result at
$x = (0, h)$, we obtain

$\frac{\partial^2 \Phi_{\gamma(t)}(x)}{\partial N(t)^2} = \frac{p_0(h,t)}{\left(h^2 + t^2 + r(h,t)\right)^2}$, (179)

where $p_0, r : \mathbb{R}^2 \to \mathbb{R}$ are functions given by the formulae


Po(h,t)  = 

ctA 


h  —  t  -f-  cht  + 


ct2  c2t3 


+  3  ht2  (. f(t )  +  g(t))  -  2 13  (2  f(t)  -  g(t)) 


~(f(t)  +  S9(t))+ht3(f'(t)+g'(t)) -tA(f'(t)-g'(t))-Zt5(f(t)2+g(t)2) 
-C~Y (/'W  +  9'(t))  ~  t 6  f(t) (f'(t)  -  g'(t))  -  t 6 g(t) (f(t)  +  g'(t)) 


c  t2  c2 13 

h  +  t  —  cht  +  — —  — - — h  : 


2  •  2  ■  Zht2(f(t)-g(t))+2t3(2f(t)+g(t)) 

-^(m-bgitj)  +ht3(f'(t)-g,(t))  +t4(f,(t)+g'(t))+3t5(f(t)2+g(t)2) 


ct 4 
2 

ct 5 


~  9'(t))  +t6  f(t)[f'(t)  +  g'(t))  -t6 g(t)(f'(t)  -  g'(t)) 


(180) 


$r(h,t) = -c\,h\,t^2 - 2\,h\,t^3 g(t) + \frac{c^2 t^4}{4} + 2\,t^4 f(t) + c\,t^5 g(t) + t^6 \left(f(t)^2 + g(t)^2\right)$. (181)

We also introduce the notation

$p_1(h,t) = \left(h^2 + t^2 + r(h,t)\right)^2 - \left(h^2 + t^2\right)^2 = 2\,(h^2 + t^2) \cdot r(h,t) + r(h,t)^2$. (182)

Obviously, (180) - (182) are algebraic combinations of $f$, $g$, $f'$, $g'$, $t$, and $h$, and an examination
of formulae (180) - (182) immediately shows that there exist positive real numbers $a$, $h_0$, and
$C$ (dependent on $\gamma$) such that


| po {h, t)  -  h2  +  t2  -  3  cht2 
Po(h,t)  -pi(h,t)  -2 cht2  (h2  +  t2)  ( h 2  -  t 2) 

|j>o(M)  -pi(h,t)2 
pi(M) 

{h2  + f2)2 


< 

C{h2+t2)2, 

(183) 

< 

C(h2+t2)4, 

(184) 

< 

C(h2+t2f, 

(185) 

1  < 

1, 

(186) 

for all $h < h_0$, $t \in (-a, a)$. Substituting (182) into (179), we have


_  Po(M) 

dN(t)*  (A2+t*)J(  1  +  jg^r) 

_  Po(^) 

(h2  +  t2)2^  J  (h2+t2)2*’ 


(187) 


where  the  convergence  of  the  series  follows  from  (186).  Combining  (183)  -  (185),  we  obtain 


d2$7(t)(x) 

h2-t2 

cht2  (bh2  + 12) 

Po{h,  t)  —  h2  +  t2  —  3  cht2 

dN(t )2 

\h2+t2)2 

(h2  + t2)3 

> 

(j h2  +  t 2)2 

Po(h,t)  -pi{h,t)  -2 cht2  (h2  +  t2)  (h2  -t2) 

00 

.  XT 

Po(M)  -Pi(h,t)k 

{h2  +  t2f 

+  L, 

2 

{h2  +  t2)2k+2 

<  2  C  +  C--^-, 
1  —  a. 


(188) 


with  a  defined  by  the  formula 


Ot  =  sup 

h<ho  ,  t£(— a, a) 


pi(M) 
(h2  + t2)2 


(189) 


Now,  introducing  the  notation 


.A  =  2C  +  C  • 


or 


1  —  a 


(190) 


we  obtain  (178).  □ 
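
The leading-order bookkeeping behind (178) can be sketched as follows. The sketch writes $D = h^2 + t^2$ (a shorthand introduced here, not in the original) and keeps only the dominant parts $p_0 \approx h^2 - t^2 + 3cht^2$ (cf. (183)) and $p_1 \approx 2Dr \approx -2cht^2 D$ (from (181), (182)). The $\approx$ signs hide exactly the remainders that the proof bounds by constants, so this is an informal summary under those assumptions rather than a substitute for the estimates (183) - (188).

```latex
\frac{p_0}{(D + r)^2}
  = \frac{p_0}{D^2} \sum_{k=0}^{\infty} (-1)^k \frac{p_1^k}{D^{2k}}
  \approx \frac{h^2 - t^2 + 3cht^2}{D^2} + \frac{2cht^2\,(h^2 - t^2)}{D^3}
  = \frac{h^2 - t^2}{D^2} + \frac{cht^2\,\bigl[3(h^2 + t^2) + 2(h^2 - t^2)\bigr]}{D^3}
  = \frac{h^2 - t^2}{D^2} + \frac{cht^2\,(5h^2 + t^2)}{D^3}.
```

The two terms on the last line are precisely the ones subtracted on the left hand side of (178).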

Lemma 4.2 provides an explicit formula for the operator $K^{2,0}$, defined in (135), in the
case when $\gamma$ is a circle. The following lemma shows that the operator $K^{2,0}$ on an arbitrary
sufficiently smooth Jordan curve of length $L$ is a compact perturbation of $K^{2,0}$ on the circle
of radius $L/(2\pi)$. Its proof is an immediate consequence of estimate (178) in Lemma 4.5.


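Lemma 4.6 Suppose that $\gamma : [0, L] \to \mathbb{R}^2$ is a sufficiently smooth Jordan curve parametrized
by its arclength, and that $\eta : [0, L] \to \mathbb{R}^2$ denotes the circle of radius $L/(2\pi)$, also parametrized
by its arclength. In addition, suppose that $\sigma : [0, L] \to \mathbb{R}$ is a twice continuously differentiable
function. Then,

$\text{f.p.} \int_0^L \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s))}{\partial N(t)^2}\, \sigma(t)\, dt = \text{f.p.} \int_0^L \frac{\partial^2 \Phi_{\eta(t)}(\eta(s))}{\partial N(t)^2}\, \sigma(t)\, dt + M_2(\sigma)(s)$, (191)

where $M_2 : c[0, L] \to c[0, L]$ is a compact operator defined by the formula

$M_2(\sigma)(s) = \int_0^L \left( \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s))}{\partial N(t)^2} - \frac{\partial^2 \Phi_{\eta(t)}(\eta(s))}{\partial N(t)^2} \right) \sigma(t)\, dt$. (192)

Furthermore, for any $t \ne s$,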


m2(s,t) 


a#: tVY?  _  1  (2* r\2 
117(a)  -7(t)  II4  2  [lJ 

|  117(a)  -  7(t)|[2  -  2  (^)2  (l  -  cos  (Zfjs  -  t))) 
117(a)  -7WII2  2  (^)2(l~  cos  (^(s-i))) 


(193) 


and  for  t  =  s, 

m2{s,s)  =  ±(c{s)  )2-A(^l)2,  (194) 

where $c(s)$ is the curvature of $\gamma$ at the point $\gamma(s)$, and $m_2 : [0, L] \times [0, L] \to \mathbb{R}$ is the kernel of
the operator $M_2$.


The following theorem provides the so-called jump conditions for the operators (14) and
(15) on the boundary $\Gamma$, when $\Gamma$ is sufficiently smooth.

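Theorem 4.7 Suppose that $\gamma : [0, L] \to \mathbb{R}^2$ is a sufficiently smooth Jordan curve parametrized
by its arclength. Then, for any sufficiently smooth function $\sigma : [0, L] \to \mathbb{R}$,

$K^{2,0}_{\gamma,e}(\sigma)(s) - K^{2,0}_{\gamma,i}(\sigma)(s) =$

$= \lim_{h \to 0} \int_0^L \left( \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) + h \cdot N(s))}{\partial N(t)^2} - \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(t)^2} \right) \sigma(t)\, dt$

$= -2\pi\, c(s)\, \sigma(s)$, (195)

and

$K^{2,0}_{\gamma,e}(\sigma)(s) + K^{2,0}_{\gamma,i}(\sigma)(s) =$

$= \lim_{h \to 0} \int_0^L \left( \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) + h \cdot N(s))}{\partial N(t)^2} + \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(t)^2} \right) \sigma(t)\, dt$

$= 2 \cdot \text{f.p.} \int_0^L \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s))}{\partial N(t)^2}\, \sigma(t)\, dt$, (196)

where $c(s)$ denotes the curvature of $\gamma$ at $\gamma(s)$. In other words, the quadruple layer potential
with density $\sigma$ (see (6)) can be continuously extended from $\Omega$ to $\bar{\Omega}$ and from $\mathbb{R}^2 \setminus \bar{\Omega}$ to $\mathbb{R}^2 \setminus \Omega$,
with the limiting values given by the formulae

$K^{2,0}_{\gamma,i}(\sigma)(s) = \pi\, c(s)\, \sigma(s) + \text{f.p.} \int_0^L \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s))}{\partial N(t)^2}\, \sigma(t)\, dt$, (197)

$K^{2,0}_{\gamma,e}(\sigma)(s) = -\pi\, c(s)\, \sigma(s) + \text{f.p.} \int_0^L \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s))}{\partial N(t)^2}\, \sigma(t)\, dt$. (198)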
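
As a numerical sanity check of the jump relation (195), the following sketch evaluates the quadruple layer potential of a circle at two points displaced by $\pm h$ along the exterior normal and compares the difference with $-2\pi c(s)\sigma(s)$. It assumes $\Phi_{x_0}(x) = \log(1/\|x - x_0\|)$; formula (9) is not reproduced in this section, so this sign convention is an assumption, chosen because it yields the leading term $(h^2 - t^2)/(h^2 + t^2)^2$ of (178). The function names, the test density, and the periodic trapezoidal discretization are illustrative choices and do not come from the paper.

```python
import numpy as np

def quadrupole_kernel(x, sources, normals):
    """Second normal derivative (w.r.t. the source point) of log(1/|x - x0|).

    Assumed convention: Phi_{x0}(x) = -log|x - x0|, which gives
    (2 (u.N)^2 - |u|^2) / |u|^4 with u = x - x0.
    """
    u = x[None, :] - sources
    un = np.sum(u * normals, axis=1)
    r2 = np.sum(u * u, axis=1)
    return (2.0 * un**2 - r2) / r2**2

def circle_jump(radius, theta0, h, sigma, n=8192):
    """Exterior minus interior value of the quadruple layer potential on a circle."""
    theta = 2.0 * np.pi * np.arange(n) / n
    normals = np.stack([np.cos(theta), np.sin(theta)], axis=1)  # exterior normals
    sources = radius * normals
    n0 = np.array([np.cos(theta0), np.sin(theta0)])
    ds = 2.0 * np.pi * radius / n          # arclength weight of the trapezoidal rule
    density = sigma(theta)
    exterior = np.sum(quadrupole_kernel((radius + h) * n0, sources, normals) * density) * ds
    interior = np.sum(quadrupole_kernel((radius - h) * n0, sources, normals) * density) * ds
    return exterior - interior

if __name__ == "__main__":
    sigma = lambda th: 1.0 + 0.5 * np.cos(th)
    r, th0 = 2.0, 0.7
    for h in (0.08, 0.04, 0.02):
        print(h, circle_jump(r, th0, h, sigma), -2.0 * np.pi * sigma(th0) / r)
```

For a circle the curvature is $c = 1/r$, so the printed differences should approach the last column as $h$ decreases, with a discrepancy proportional to $h$.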


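Proof. Without loss of generality, we can assume that $s \ne 0$ and $s \ne L$. We begin by
proving (196). Suppose that $\eta : [0, L] \to \mathbb{R}^2$ is the circle of radius $L/(2\pi)$ parametrized by its
arclength. We define the functions $\Sigma^h_\gamma, \Sigma^h_\eta : [0, L] \times [0, L] \to \mathbb{R}$ via the formulae

$\Sigma^h_\gamma(s,t) = \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) + h \cdot N(s))}{\partial N(t)^2} + \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(t)^2}$, (199)

$\Sigma^h_\eta(s,t) = \frac{\partial^2 \Phi_{\eta(t)}(\eta(s) + h \cdot N(s))}{\partial N(t)^2} + \frac{\partial^2 \Phi_{\eta(t)}(\eta(s) - h \cdot N(s))}{\partial N(t)^2}$, (200)

and, substituting (199), (200) into (196), obtain the identity

$K^{2,0}_{\gamma,e}(\sigma)(s) + K^{2,0}_{\gamma,i}(\sigma)(s) = \lim_{h \to 0} \int_0^L \Sigma^h_\eta(s,t)\, \sigma(t)\, dt + \lim_{h \to 0} \int_0^L \left( \Sigma^h_\gamma(s,t) - \Sigma^h_\eta(s,t) \right) \sigma(t)\, dt$. (201)

Substituting (160), (161) in Theorem 4.3 into (201), we have

$K^{2,0}_{\gamma,e}(\sigma)(s) + K^{2,0}_{\gamma,i}(\sigma)(s) = 2 \cdot \text{f.p.} \int_0^L \frac{\partial^2 \Phi_{\eta(t)}(\eta(s))}{\partial N(t)^2}\, \sigma(t)\, dt + \lim_{h \to 0} \int_0^L \left( \Sigma^h_\gamma(s,t) - \Sigma^h_\eta(s,t) \right) \sigma(t)\, dt$. (202)

Due to Lemma 4.5, there exist positive real constants $C_0$, $a$, and $h_0$ such that for any $s \in [0, L]$

$\left| \Sigma^h_\gamma(s,t) - \Sigma^h_\eta(s,t) \right| \le C_0$, (203)

for all $|t - s| < a$, $0 < h < h_0$. For any $t \ne s$ and sufficiently small $h$, both $\Sigma^h_\gamma(s,t)$ and $\Sigma^h_\eta(s,t)$
are $c^\infty$-functions. Therefore, there also exist positive real constants $h_1$, $C_1$ such that for any
$s \in [0, L]$

$\left| \Sigma^h_\gamma(s,t) - \Sigma^h_\eta(s,t) \right| \le C_1$, (204)

for all $|t - s| \ge a$, $0 < h < h_1$. Now, applying Lebesgue's dominated convergence theorem (see,
for example, [18]) to the second integral of the right hand side of (202), we obtain

$\lim_{h \to 0} \int_0^L \left( \Sigma^h_\gamma(s,t) - \Sigma^h_\eta(s,t) \right) \sigma(t)\, dt = \int_0^L \lim_{h \to 0} \left( \Sigma^h_\gamma(s,t) - \Sigma^h_\eta(s,t) \right) \sigma(t)\, dt$

$= 2 \int_0^L \left( \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s))}{\partial N(t)^2} - \frac{\partial^2 \Phi_{\eta(t)}(\eta(s))}{\partial N(t)^2} \right) \sigma(t)\, dt$. (205)

Finally, formula (196) immediately follows from the combination of (202), (205) with (191),
(192) in Lemma 4.6.

We now proceed by proving formula (195). We define the functions $\Delta^h_\gamma, \Delta^h_\eta : [0, L] \times [0, L] \to \mathbb{R}$
via the formulae

$\Delta^h_\gamma(s,t) = \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) + h \cdot N(s))}{\partial N(t)^2} - \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(t)^2}$, (206)

$\Delta^h_\eta(s,t) = \frac{\partial^2 \Phi_{\eta(t)}(\eta(s) + h \cdot N(s))}{\partial N(t)^2} - \frac{\partial^2 \Phi_{\eta(t)}(\eta(s) - h \cdot N(s))}{\partial N(t)^2}$, (207)

and, by substituting (206), (207) into (195), obtain the identity

$K^{2,0}_{\gamma,e}(\sigma)(s) - K^{2,0}_{\gamma,i}(\sigma)(s) = \frac{c(s) L}{2\pi} \lim_{h \to 0} \int_0^L \Delta^h_\eta(s,t)\, \sigma(t)\, dt + \lim_{h \to 0} \int_0^L \left( \Delta^h_\gamma(s,t) - \frac{c(s) L}{2\pi} \cdot \Delta^h_\eta(s,t) \right) \sigma(t)\, dt$. (208)

Substituting (160), (161) in Theorem 4.3 into (208), we get

$K^{2,0}_{\gamma,e}(\sigma)(s) - K^{2,0}_{\gamma,i}(\sigma)(s) = -2\pi\, c(s)\, \sigma(s) + \lim_{h \to 0} \int_0^L \left( \Delta^h_\gamma(s,t) - \frac{c(s) L}{2\pi} \cdot \Delta^h_\eta(s,t) \right) \sigma(t)\, dt$. (209)

Due to Lemma 4.5, there exist positive real constants $C_0$, $a$, and $h_0$ such that for any $s \in [0, L]$

$\left| \Delta^h_\gamma(s,t) - \frac{c(s) L}{2\pi} \cdot \Delta^h_\eta(s,t) \right| \le C_0$, (210)

for all $|t - s| < a$, $0 < h < h_0$. For any $t \ne s$ and sufficiently small $h$, both $\Delta^h_\gamma(s,t)$ and $\Delta^h_\eta(s,t)$
are $c^\infty$-functions. Therefore, there also exist positive real constants $h_1$, $C_1$ such that for any
$s \in [0, L]$

$\left| \Delta^h_\gamma(s,t) - \frac{c(s) L}{2\pi} \cdot \Delta^h_\eta(s,t) \right| \le C_1$, (211)

for all $|t - s| \ge a$, $0 < h < h_1$. Applying Lebesgue's dominated convergence theorem (see, for
example, [18]) to the second integral of the right hand side of (209), we have

$\lim_{h \to 0} \int_0^L \left( \Delta^h_\gamma(s,t) - \frac{c(s) L}{2\pi} \cdot \Delta^h_\eta(s,t) \right) \sigma(t)\, dt = \int_0^L \lim_{h \to 0} \left( \Delta^h_\gamma(s,t) - \frac{c(s) L}{2\pi} \cdot \Delta^h_\eta(s,t) \right) \sigma(t)\, dt$. (212)

Examining (206), (207), we obviously have

$\lim_{h \to 0} \left( \Delta^h_\gamma(s,t) - \frac{c(s) L}{2\pi} \cdot \Delta^h_\eta(s,t) \right) = 0$. (213)

Therefore, the integral on the right hand side of (212) is zero, from which (195) follows
immediately. □

5 Generalizations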

We have presented explicit (modulo an integral operator with a smooth kernel) formulae for
integro-pseudodifferential operators of potential theory in two dimensions (up to order 2). The
work presented here admits several obvious extensions.

a. Formulae (89) - (107) have their counterparts for elliptic PDEs other than the Laplace
equation. Indeed, for any elliptic PDE in two dimensions, the Green's function has the form

$G(x, y) = \phi(x, y) \cdot \log(\|x - y\|) + \psi(x, y)$, (214)

with $\phi$, $\psi$ a pair of smooth functions; derivations of Section 4 are almost unchanged when
$\log(\|x - y\|)$ is replaced with (214). In particular, the counterparts of the formulae (89) - (99)
for the Helmholtz equation (with either real or complex Helmholtz coefficient) are identical to
(89) - (99); the counterparts of the formulae (100) - (107) for the Helmholtz equation do not
coincide with (100) - (107) exactly; instead, they assume the form


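(a) $K^{3,0}_{\gamma,i}(\sigma)(s) = -2\pi (c(s))^2 \sigma(s) + 4\pi k^2 \sigma(s) + \pi \sigma''(s) - 2\pi c'(s) H(\sigma)(s) - 3\pi c(s) H(\sigma')(s) + N_3(\sigma)(s)$, (215)

$K^{3,0}_{\gamma,e}(\sigma)(s) = 2\pi (c(s))^2 \sigma(s) - 4\pi k^2 \sigma(s) - \pi \sigma''(s) - 2\pi c'(s) H(\sigma)(s) - 3\pi c(s) H(\sigma')(s) + N_3(\sigma)(s)$, (216)

(b) $K^{2,1}_{\gamma,i}(\sigma)(s) = -4\pi k^2 \sigma(s) - \pi \sigma''(s) + \pi c'(s) H(\sigma)(s) + \pi c(s) H(\sigma')(s) + G_3(\sigma)(s)$, (217)

$K^{2,1}_{\gamma,e}(\sigma)(s) = 4\pi k^2 \sigma(s) + \pi \sigma''(s) + \pi c'(s) H(\sigma)(s) + \pi c(s) H(\sigma')(s) + G_3(\sigma)(s)$, (218)

(c) $K^{1,2}_{\gamma,i}(\sigma)(s) = 4\pi k^2 \sigma(s) + \pi \sigma''(s) + \pi c(s) H(\sigma')(s) + \tilde{G}_3(\sigma)(s)$, (219)

$K^{1,2}_{\gamma,e}(\sigma)(s) = -4\pi k^2 \sigma(s) - \pi \sigma''(s) + \pi c(s) H(\sigma')(s) + \tilde{G}_3(\sigma)(s)$, (220)

(d) $K^{0,3}_{\gamma,i}(\sigma)(s) = 2\pi (c(s))^2 \sigma(s) - 4\pi k^2 \sigma(s) - \pi \sigma''(s) - \pi c'(s) H(\sigma)(s) - 3\pi c(s) H(\sigma')(s) + \tilde{N}_3(\sigma)(s)$, (221)

$K^{0,3}_{\gamma,e}(\sigma)(s) = -2\pi (c(s))^2 \sigma(s) + 4\pi k^2 \sigma(s) + \pi \sigma''(s) - \pi c'(s) H(\sigma)(s) - 3\pi c(s) H(\sigma')(s) + \tilde{N}_3(\sigma)(s)$, (222)

where $k \in \mathbb{C}$ is the Helmholtz coefficient, and the operators $N_3, G_3, \tilde{N}_3, \tilde{G}_3 : L^2[0, L] \to L^2[0, L]$
are compact.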

b. The derivation of the three-dimensional counterparts of formulae (89) - (107) is completely
straightforward; such expressions have been obtained, and the paper reporting them is in
preparation.

c. In certain areas of mathematical physics, one encounters integro-pseudodifferential equations
whose analysis is outside the scope of this paper. An important example is the Stratton-Chu
equations, to which Maxwell's equations are frequently reduced in computational electromagnetics.
Another source of such problems is the scattering of elastic waves in solids. Problems
of this type are currently under investigation.

We present a procedure for the design of high order quadrature rules for the numerical
evaluation of singular and hypersingular integrals; such integrals are frequently encountered
in the solution of integral equations of potential theory in two dimensions. Unlike integrals of
both smooth and weakly singular functions, hypersingular integrals are pseudo-differential
operators, being limits of certain integrals; as a result, standard quadrature formulae fail
for hypersingular integrals. On the other hand, such expressions are often encountered
in mathematical physics (see, for example, [11]), and it is desirable to have simple and
efficient “quadrature” formulae for them. The algorithm we present constructs high-order
“quadratures” for the evaluation of hypersingular integrals. The additional advantage of the
scheme is the fact that each of the quadratures it produces can be used simultaneously for the
efficient evaluation of hypersingular integrals, Hilbert transforms, and integrals involving
both smooth and logarithmically singular functions; this results in significantly simplified
implementations. The performance of the procedure is illustrated with several numerical
examples.

1 Introduction

Numerical integration is one of the most frequently encountered computational procedures.
When smooth functions are to be integrated, classical techniques tend to be adequate,
especially in one and two dimensions; one of the most efficient general-purpose tools consists of
various versions of nested Gaussian quadrature rules (see, for example, [20, 18, 3, 6]). In
cases where extremely efficient special-purpose quadratures are warranted, Gaussian (and
more recently, Generalized Gaussian) quadratures are the approach of choice.

When singular functions are to be integrated, the situation tends to be less satisfactory.
Special-purpose Gaussian quadratures can be easily constructed for functions of the form

$f(x) = s(x) \cdot \phi(x)$, (1)

where $s$ is a fixed singular function, and $\phi$ is smooth. On the other hand, such situations
are relatively rare; much more frequently, one is confronted with integrands of the form

$f(x) = s(x) \cdot \phi(x) + \psi(x)$, (2)

where $s$ is a fixed singular function, and $\phi$ and $\psi$ are two distinct smooth functions (often,
several different singularities are involved). Here, Gaussian quadratures cannot be used
directly, and during the last several years, Generalized Gaussian quadratures have been
developed as a tool (in part) for dealing with such situations.

The situation is further complicated when (as frequently happens in potential theory)
the “integrals” to be evaluated are not, strictly speaking, integrals, but involve expressions
of the form

$\int_{-1}^{1} \frac{\phi(x)}{y - x}\, dx$, (3)

$\int_{-1}^{1} \frac{\phi(x)}{(y - x)^2}\, dx$, (4)

$\int_{-1}^{1} \frac{\phi(x)}{(y - x)^3}\, dx$, (5)

etc., understood in the appropriate finite part sense (in the engineering literature, (4) is
often referred to as the “hypersingular” integral). Normally, “integrals” (3) - (5) (and similar
objects) are treated via special-purpose techniques (product integration, interpolatory
quadratures, etc.). A drawback of this approach is the need to separate singularities of
different types, so that each can be treated via an appropriate procedure. For example, in
(2), one would need to have access to each of the functions $\phi$, $\psi$ individually, as opposed to
being able to evaluate the functions in toto (the latter situation is frequently encountered
in practice).

In this paper, we design a collection of algorithms for the construction of high-order
“quadratures” for the evaluation of hypersingular integrals. The additional advantage of
the scheme is the fact that each of the quadratures it produces can be used simultaneously for
the efficient evaluation of hypersingular integrals, Hilbert transforms, and integrals involving
both smooth and logarithmically singular functions; this results in significantly simplified
implementations.

Remark 1.1 Unlike the quadratures for functions of the form (2), the quadratures
constructed in this paper are not convergent in the classical sense. Instead, they produce a
prescribed accuracy for a prescribed set of functions, such as Legendre polynomials of all
orders no greater than some natural number $n$, Legendre polynomials multiplied by
logarithms, etc. Due to the triangle inequality, it is easy to estimate the precision produced
when such quadratures are applied to linear combinations of Legendre polynomials, Legendre
polynomials multiplied by logarithms, etc. Finally, we observe that if the chosen accuracy is
sufficiently small (such as the machine precision), the behavior of the resulting quadratures
is indistinguishable from rapid convergence (as can be seen from, for example, Figures 2 -
3 in this paper).

Remark 1.2 During the last two decades, numerical techniques have been developed in
computational potential theory (especially for the Helmholtz equation and related problems
involving time-domain Maxwell's equations) that replace classical integral equations with
combined integro-pseudo-differential equations. The reasons for these recent developments
are involved, and have to do with so-called “spurious resonances” (see, for example, [4,
15, 16, 19]). Without getting into the analytical details, we observe that the interest in the
numerical solution of such integro-pseudo-differential equations is growing rapidly, and one
of the principal motivations behind this work is the design of appropriate rapidly convergent
discretization schemes.

The paper is organized as follows: In Section 2, the necessary mathematical and
numerical preliminaries are introduced. In Section 3, we develop numerical quadratures for
integrands that are algebraic combinations of smooth functions and functions with
singularities of the form $\log|x|$, $\frac{1}{x}$, $\frac{1}{x^2}$. In Section 4, we describe a numerical procedure for the
construction of the quadratures from Section 3.2. Section 5 contains numerical examples
of some of the quadratures developed in this paper. Finally, in Section 6 we briefly
discuss extensions of results of this paper to singularities other than $\log|x|$, $\frac{1}{x}$, $\frac{1}{x^2}$, and to
two-dimensional singular and hypersingular integrals.


2  Mathematical  and  Numerical  Preliminaries 

In  this  section,  we  summarize  several  results  from  classical  and  numerical  analysis  to  be 
used  in  the  remainder  of  this  paper.  Detailed  references  are  given  in  the  text. 


2.1  Principal  Value  Integrals 


Integrals of the form

$\int_a^b \frac{\varphi(x)}{x - y}\, dx$, (6)

where $y \in (a, b)$, do not exist in the classical sense, and are often referred to as singular
integrals.

Definition 2.1 Suppose that $\varphi$ is a function $[a, b] \to \mathbb{R}$, $y \in (a, b)$, and the limit

$\lim_{\epsilon \to 0} \left( \int_a^{y - \epsilon} \frac{\varphi(x)}{x - y}\, dx + \int_{y + \epsilon}^{b} \frac{\varphi(x)}{x - y}\, dx \right)$ (7)

exists and is finite. Then we will denote the limit (7) by

$\text{p.v.} \int_a^b \frac{\varphi(x)}{x - y}\, dx$, (8)

and refer to it as a principal value integral.

Theorem 2.1 Suppose that the function $\varphi : [a, b] \to \mathbb{R}$ is continuously differentiable in a
neighborhood of $y \in (a, b)$. Then the principal value integral (8) exists.

2.2  Finite  Part  Integrals 

In this paper, we will be dealing with integrals of the form

$\int_a^b \frac{\varphi(x)}{(x - y)^2}\, dx$, (9)

where $y \in (a, b)$, which are divergent in the classical sense. Integrals of this type are often
referred to as hypersingular or strongly singular.

Definition 2.2 Suppose that $\varphi$ is a function $[a, b] \to \mathbb{R}$, $y \in (a, b)$, and the limit

$\lim_{\epsilon \to 0} \left( \int_a^{y - \epsilon} \frac{\varphi(x)}{(x - y)^2}\, dx + \int_{y + \epsilon}^{b} \frac{\varphi(x)}{(x - y)^2}\, dx - \frac{2\varphi(y)}{\epsilon} \right)$ (10)

exists and is finite. Then we will denote the limit (10) by

$\text{f.p.} \int_a^b \frac{\varphi(x)}{(x - y)^2}\, dx$, (11)

and refer to it as a finite part integral (see, for example, [9]).
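
To make the limit in Definition 2.2 concrete, the sketch below evaluates the two truncated integrals numerically, subtracts the divergent term $2\varphi(y)/\epsilon$ (the standard Hadamard regularization, assumed here to be what (10) prescribes), and watches the result settle as $\epsilon$ decreases. The function names, the dyadically graded Gauss-Legendre panels, and the test function $\varphi(x) = x^2$ are illustrative choices, not taken from the paper.

```python
import numpy as np

def _panel(f, lo, hi, nodes, weights):
    """Gauss-Legendre approximation of the integral of f over [lo, hi]."""
    half, mid = 0.5 * (hi - lo), 0.5 * (hi + lo)
    return half * np.sum(weights * f(mid + half * nodes))

def truncated_integral(phi, y, eps, a=-1.0, b=1.0, order=16):
    """Integral of phi(x)/(x-y)^2 over [a, y-eps] and [y+eps, b].

    Panels double in width as they move away from y, so the integrand stays
    well resolved on each panel even though it is large near y.
    """
    nodes, weights = np.polynomial.legendre.leggauss(order)
    f = lambda x: phi(x) / (x - y) ** 2
    total = 0.0
    right, width = y - eps, eps
    while right > a:                      # left piece, marching from y-eps down to a
        left = max(a, right - width)
        total += _panel(f, left, right, nodes, weights)
        right, width = left, 2.0 * width
    left, width = y + eps, eps
    while left < b:                       # right piece, marching from y+eps up to b
        right = min(b, left + width)
        total += _panel(f, left, right, nodes, weights)
        left, width = right, 2.0 * width
    return total

def finite_part(phi, y, eps, a=-1.0, b=1.0):
    """Regularized value: truncated integrals minus the divergent term 2*phi(y)/eps."""
    return truncated_integral(phi, y, eps, a, b) - 2.0 * phi(y) / eps

if __name__ == "__main__":
    phi, y = (lambda x: x**2), 0.3
    # Closed form for phi(x) = x^2 on [-1, 1], obtained by expanding x^2 about y.
    exact = 2.0 + 2.0 * y * np.log((1 - y) / (1 + y)) - 2.0 * y**2 / (1 - y**2)
    for eps in (1e-2, 1e-3, 1e-4):
        print(eps, finite_part(phi, y, eps), exact)
```

For this particular $\varphi$ the regularized value differs from the limit by exactly $-2\epsilon$, which is why the printed values converge linearly in $\epsilon$.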

The following obvious theorem provides sufficient conditions for the existence of the
finite part integral (10), and establishes a connection between finite part and principal
value integrals.

Theorem 2.2 Suppose that the function $\varphi : [a, b] \to \mathbb{R}$ is twice continuously differentiable
in a neighborhood of $y \in (a, b)$. Then the finite part integral (11) exists, and

$\text{f.p.} \int_a^b \frac{\varphi(x)}{(x - y)^2}\, dx = \frac{d}{dy}\, \text{p.v.} \int_a^b \frac{\varphi(x)}{x - y}\, dx$. (12)

2.3 Legendre Polynomials and Legendre Expansions

For any natural number $n$, the Legendre differential equation is

$(1 - x^2)\, \frac{d^2 u}{dx^2} - 2x\, \frac{du}{dx} + n(n + 1) \cdot u = 0$. (13)

One solution of the Legendre differential equation (13) is the Legendre polynomial $P_n :
[-1, 1] \to \mathbb{R}$, defined by the three-term recursion formula

$P_{n+1}(x) = \frac{2n + 1}{n + 1} \cdot x \cdot P_n(x) - \frac{n}{n + 1} \cdot P_{n-1}(x)$, (14)

with

$P_0(x) = 1$, (15)

$P_1(x) = x$. (16)

As is well-known, the Legendre polynomials have an explicit expression given by the formula

$P_n(x) = \frac{1}{2^n\, n!}\, \frac{d^n}{dx^n} (x^2 - 1)^n$. (17)

Furthermore, they are orthogonal with respect to the inner product

$(f, g) = \int_{-1}^{1} f(x)\, g(x)\, dx$. (18)
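
The recursion (14) - (16) and the orthogonality with respect to (18) are easy to check numerically. The following is a minimal sketch, assuming NumPy's `numpy.polynomial.legendre` module as an independent reference; the helper name `legendre_values` is hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre as leg

def legendre_values(n_max, x):
    """Return values[n, j] = P_n(x_j) for n = 0..n_max via the three-term recursion."""
    x = np.asarray(x, dtype=float)
    values = np.empty((n_max + 1,) + x.shape)
    values[0] = 1.0                       # P_0(x) = 1
    if n_max >= 1:
        values[1] = x                     # P_1(x) = x
    for n in range(1, n_max):
        values[n + 1] = ((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1)
    return values

if __name__ == "__main__":
    x = np.linspace(-1.0, 1.0, 7)
    recursion = legendre_values(5, x)
    reference = np.array([leg.legval(x, np.eye(6)[n]) for n in range(6)])
    print(np.max(np.abs(recursion - reference)))

    # Gram matrix of P_0..P_5 under the inner product (18); an 8-point Gauss rule
    # integrates the products (degree <= 10) exactly, so off-diagonal entries vanish.
    nodes, weights = leg.leggauss(8)
    P = legendre_values(5, nodes)
    print(np.round((P * weights) @ P.T, 12))
```

The diagonal of the Gram matrix is $2/(2n+1)$, which is the reason for the factor $(2n+1)/2$ in the definition of the Legendre coefficients.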


Suppose that $x_1, x_2, \ldots, x_N$ denote the zeros of the $N$-th Legendre polynomial $P_N : [-1, 1] \to \mathbb{R}$.
Then we will refer to the points $\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N$ on the interval $[a, b]$, defined by the
formula

$\tilde{x}_i = \frac{b - a}{2} \cdot x_i + \frac{b + a}{2}$, (19)

for all $i = 1, 2, \ldots, N$, as the $N$ Legendre nodes on $[a, b]$.

For any sufficiently smooth function $\varphi : [-1, 1] \to \mathbb{R}$ we will be denoting by $\alpha_n$ the $n$-th
Legendre coefficient of $\varphi$, defined by the formula

$\alpha_n = \frac{2n + 1}{2} \int_{-1}^{1} \varphi(x)\, P_n(x)\, dx$, (20)

so that for all $x \in [-1, 1]$

$\varphi(x) = \sum_{n=0}^{\infty} \alpha_n P_n(x)$. (21)

The series (21) is referred to as the Legendre expansion of $\varphi$. Given any natural number
$N$, for computational purposes we will be approximating the Legendre expansion (21) by
its truncated series of degree $N - 1$

$\varphi(x) \approx \sum_{n=0}^{N-1} \alpha_n P_n(x)$. (22)

The following lemma states that the truncated Legendre expansion of degree $N - 1$ (22)
converges rapidly for sufficiently smooth functions, and is proved, for example, in [7].

Lemma 2.3 Suppose that $\varphi : [-1, 1] \to \mathbb{R}$ is $k$ times continuously differentiable and that
$\sum_{n=0}^{\infty} \alpha_n P_n(x)$ denotes its Legendre expansion. Then, for any point $x \in [-1, 1]$,

$\left| \varphi(x) - \sum_{n=0}^{N-1} \alpha_n P_n(x) \right| = O\left( \frac{1}{N^k} \right)$. (23)
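
The following sketch illustrates (20) - (22) and the decay described by Lemma 2.3. It approximates the Legendre coefficients with a Gauss-Legendre rule and reports the maximum error of the truncated expansion for an analytic test function. The helper name, the quadrature order of 64, and the choice $\varphi(x) = e^x$ are assumptions made for illustration only.

```python
import numpy as np
from numpy.polynomial import legendre as leg

def legendre_coefficients(phi, n_terms, quad_order=64):
    """Approximate alpha_n = (2n+1)/2 * int_{-1}^{1} phi(x) P_n(x) dx for n < n_terms."""
    nodes, weights = leg.leggauss(quad_order)
    samples = phi(nodes)
    coeffs = np.empty(n_terms)
    for n in range(n_terms):
        basis = leg.legval(nodes, np.eye(n_terms)[n])   # P_n at the quadrature nodes
        coeffs[n] = 0.5 * (2 * n + 1) * np.sum(weights * samples * basis)
    return coeffs

def truncation_error(phi, n_terms, n_check=201):
    """Maximum error of the truncated expansion (22) on a uniform grid in [-1, 1]."""
    x = np.linspace(-1.0, 1.0, n_check)
    approx = leg.legval(x, legendre_coefficients(phi, n_terms))
    return np.max(np.abs(approx - phi(x)))

if __name__ == "__main__":
    for n_terms in (4, 8, 12):
        print(n_terms, truncation_error(np.exp, n_terms))
```

For a function that is only $k$ times differentiable the decay is algebraic, as in the lemma; for the analytic function used here it is much faster.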


The following theorem relates the coefficients in a Legendre expansion to the coefficients
in the Legendre expansion of its derivative and integral, respectively. Its proof follows from
a combination of results in [21, 1, 7, 8].

Theorem 2.4 Given a natural number $N$, suppose that the polynomial $p : [-1, 1] \to \mathbb{R}$ is
defined by the formula


Then , 


—  i 

p(x)  'y  ^  an  P rc(x) 


P  (:r)  —  Pn(x)  j 

71=0 


with  the  coefficients  bn  given  by  the  formula 


\  N -f-n  —  3 

«■  2 


fcn  =  (2n  +  l)  JL  02*+i-n,  n  =  0, ...  ,7V  —  2, 

k—n 

and  with  3]  denoting  the  integer  part  of  --+"~3 .  Furthermore, 

fX  ^ 

/  p(y)dy  =  J2c*pn(x)i 

J~1  n= 0 

with  the  coefficients  Cn  ^uen  6j/  the  formulae 


co  =  E(-i)n+1c„, 


/.  =  C"-1 _ Qn+1  .  ..  . 

2  (n  —  1)  +  1  2(n  +  l)  +  l’  n  ~  t  •  •  •  *  ^  “  2 . 


cjv-i  = 


= 


dyv-2 

2  (JV  -  2)  +  1  ’ 

O/V-l 

2  (iV  —  1)  +  l  ' 


Remark 2.5 It is well known that if $\varphi : [-1, 1] \to \mathbb{R}$ is $k$ times continuously differentiable
and $\sum_{n=0}^{\infty} \alpha_n P_n(x)$ denotes its Legendre expansion, then


N-2 

¥>'(*)-  />„(*) 


n= 0 


and 


\[  <p(y)  dy  -  Y,  cn  Pn{x) 

1  J~l  n= 0 


=  0 


£). 


where the coefficients $b_n$ and $c_n$ are defined by (26), (28) - (31), respectively.


(32) 


(33) 
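
The coefficient relations of Theorem 2.4 can be exercised with a short sketch. The integral coefficients below follow the pattern $c_n = \frac{a_{n-1}}{2(n-1)+1} - \frac{a_{n+1}}{2(n+1)+1}$ with $c_0 = \sum_{n \ge 1} (-1)^{n+1} c_n$. The derivative coefficients are written in the standard form $b_n = (2n+1)(a_{n+1} + a_{n+3} + \ldots)$, which is an assumption here: it may be indexed differently from (26). Both are compared with NumPy's `legder` and `legint`; the function names are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre as leg

def derivative_coefficients(a):
    """Legendre coefficients of p'(x) for p(x) = sum_{n<N} a_n P_n(x)."""
    a = np.asarray(a, dtype=float)
    n_terms = len(a)
    b = np.zeros(max(n_terms - 1, 0))
    for n in range(n_terms - 1):
        b[n] = (2 * n + 1) * np.sum(a[n + 1::2])    # a_{n+1} + a_{n+3} + ...
    return b

def integral_coefficients(a):
    """Legendre coefficients of int_{-1}^{x} p(y) dy for p(x) = sum_{n<N} a_n P_n(x)."""
    a = np.asarray(a, dtype=float)
    n_terms = len(a)
    padded = np.concatenate([a, [0.0, 0.0]])        # treat a_N = a_{N+1} = 0
    c = np.zeros(n_terms + 1)
    for n in range(1, n_terms + 1):
        c[n] = padded[n - 1] / (2 * (n - 1) + 1) - padded[n + 1] / (2 * (n + 1) + 1)
    signs = (-1.0) ** (np.arange(1, n_terms + 1) + 1)
    c[0] = np.sum(signs * c[1:])                    # makes the integral vanish at x = -1
    return c

if __name__ == "__main__":
    a = [0.3, -1.2, 0.5, 2.0, -0.7]
    print(derivative_coefficients(a), leg.legder(a))
    print(integral_coefficients(a), leg.legint(a, lbnd=-1))
```

Working directly on coefficient vectors in this way avoids any pointwise differentiation or integration of the expansion.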


2.4  Legendre  Functions  of  the  Second  Kind 

The Legendre polynomial $P_n$ (see (17)) is a solution of the Legendre differential equation
(13). The other solution is the Legendre function of the second kind $Q_n : \mathbb{C} \setminus [-1, 1] \to \mathbb{C}$,
defined by the three-term recursion formula

$Q_{n+1}(z) = \frac{2n + 1}{n + 1} \cdot z \cdot Q_n(z) - \frac{n}{n + 1} \cdot Q_{n-1}(z)$, (34)

with

$Q_0(z) = \frac{1}{2} \cdot \log\left( \frac{z + 1}{z - 1} \right)$, (35)

$Q_1(z) = \frac{z}{2} \cdot \log\left( \frac{z + 1}{z - 1} \right) - 1$. (36)

Clearly, $Q_n(z)$ has a branch cut in the complex $z$-plane on the real axis from $-1$ to $1$. In
agreement with standard practice, on the branch cut we define $Q_n : [-1, 1] \to \mathbb{R}$ by the
formula

$Q_n(x) = \frac{1}{2} \lim_{h \to 0} \left( Q_n(x + ih) + Q_n(x - ih) \right)$. (37)

The  following  theorem  is  known  as  Neumann’s  integral  representation  (see,  for  example, 

[S])- 

Theorem 2.6 Suppose that $P_n : [-1, 1] \to \mathbb{R}$ denotes the $n$-th Legendre polynomial, and
$Q_n : [-1, 1] \to \mathbb{R}$ the $n$-th Legendre function of the second kind defined by formula (37).
Then, for any point $y \in (-1, 1)$,

$\text{p.v.} \int_{-1}^{1} \frac{P_n(x)}{y - x}\, dx = 2\, Q_n(y)$. (38)
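
Neumann's representation (38) can be verified numerically with a small sketch. On the cut, (35) - (37) give $Q_0(y) = \frac{1}{2} \log\frac{1+y}{1-y}$ and $Q_1(y) = y\, Q_0(y) - 1$, and the recursion (34) is assumed to carry over unchanged to real arguments. The principal value is computed by subtracting the singularity, which leaves a polynomial integrand that a Gauss-Legendre rule handles exactly. Function names and the evaluation point are illustrative.

```python
import numpy as np
from numpy.polynomial import legendre as leg

def legendre_q(n, y):
    """Q_n(y) for y in (-1, 1), via the three-term recursion started from Q_0 and Q_1."""
    q_prev = 0.5 * np.log((1.0 + y) / (1.0 - y))
    if n == 0:
        return q_prev
    q = y * q_prev - 1.0
    for k in range(1, n):
        q_prev, q = q, ((2 * k + 1) * y * q - k * q_prev) / (k + 1)
    return q

def pv_legendre(n, y, order=32):
    """p.v. integral of P_n(x)/(y - x) over [-1, 1] by singularity subtraction.

    (P_n(x) - P_n(y))/(y - x) is a polynomial of degree n-1, and
    p.v. int_{-1}^{1} dx/(y - x) = log((1+y)/(1-y)).
    The point y must not coincide with a quadrature node.
    """
    x, w = leg.leggauss(order)
    coef = np.zeros(n + 1)
    coef[n] = 1.0
    p_n = lambda t: leg.legval(t, coef)
    smooth = np.sum(w * (p_n(x) - p_n(y)) / (y - x))
    return smooth + p_n(y) * np.log((1.0 + y) / (1.0 - y))

if __name__ == "__main__":
    y = 0.37
    for n in range(6):
        print(n, pv_legendre(n, y), 2.0 * legendre_q(n, y))
```

The two printed columns should agree to roughly machine precision for small $n$.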

The following theorem follows immediately from Neumann's integral representation (38)
and provides two formulae that will be subsequently used in this paper.

Theorem 2.7 Suppose that $P_n : [-1, 1] \to \mathbb{R}$ denotes the $n$-th Legendre polynomial, and
$\tilde{P}_n : [-1, 1] \to \mathbb{R}$ its primitive function defined by the formula

$\tilde{P}_n(x) = \int_{-1}^{x} P_n(y)\, dy$. (39)

Furthermore, suppose that $Q_n : [-1, 1] \to \mathbb{R}$ denotes the $n$-th Legendre function of the
second kind defined by (37). Then, for any point $y \in (-1, 1)$


/i  ^ 

1  2  'loS  ((y-^)2)  -Pn{x)dx 

i.v.f  ™  - 


-i  {y  -  x)2 


log  ((»  -  l)2)  +  P-V.  5lM  ix 

p.v.  r 

J—i  %  y  y  —  1  y  +  1 


(40) 

(41) 


2.5  Chebyshev  Systems 

Definition  2.3  X  set  of  continuous  functions  <px ,...,ipN  is  referred  to  as  a  Chebyshev 
system  on  the  interval  [a,  b)  if  the  determinant 


(  Vi{x\)  ■ 

••  <Pl(Xff)  \ 

V  Vn(x i)  •• 

••  <Pn(xn)  J 

is  nonzero  for  any  set  of  points  such  that  a  <  x  \  <  x2  <  ...  <  x n  <  b. 


(42) 


Definition  2.4  Given  a  set  of  real  numbers  xx  <  x2  <  . . .  <  xN,  suppose  that  m1;  m2,  . . ., 
mN  denotes  the  natural  numbers  defined  by  the  formulae 


m\ 


mj 


0, 

{0 ,  for  j  >  1  and  xj  jk  xj-X , 

j  -1,  for  j  >  1  and  Xj  =  Xj- 1  =  . . .  =  X\ , 

A>  for  j>k  +  1  and  Xj  =  xj-i  =  ...=  Xj_k  jk  x . 


(43) 

(44) 


A  set  of  continuously  differentiable  functions  <px,...  ,<pN  is  referred  to  as  an  extended  Cheby¬ 
shev  system  on  the  interval  [a,  6]  if  the  determinant 


(  aPmtei) 


dmN 

dxTnN 


7<Pl  (XN) 


|  \  &Vn{x i)  • 

in  which  fgps<Pi(xj)  =  <pi(xj),  is  nonzero  for 
x2  <  . . .  <  xn  <  b. 


£m7T<PN{XN)  J 


(45) 


any  set  of  points  xx,...,xN  such  that  a  <  xx  < 
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Remark  2.8  Obviously,  an  extended  Chebyshev  system  also  forms  a  Chebyshev  system. 
The  additional  constraint  is  that  the  points  xx,x2,  ...,xN  at  which  the  functions  are  evalu¬ 
ated  may  be  identical.  In  that  case,  for  each  duplicated  point,  the  first  corresponding  column 
contains  the  function  values,  the  second  column  contains  the  first  derivatives  of  the  func¬ 
tions,  the  third  column  contains  the  second  derivatives  of  the  functions,  and  so  forth. 

In  the  following  examples  several  important  cases  of  Chebyshev  and  extended  Chebyshev 
systems  are  presented  (additional  examples  can  be  found  in  [10]). 

Example 2.1 The monomials $1, x, x^2, \ldots, x^n$ form an extended Chebyshev system on any interval $[a, b] \subset (-\infty, \infty)$.

Example 2.2 The exponentials $e^{-\lambda_1 x}, e^{-\lambda_2 x}, \ldots, e^{-\lambda_n x}$ form an extended Chebyshev system for any $\lambda_1, \lambda_2, \ldots, \lambda_n > 0$ on the interval $[0, \infty)$.

Example 2.3 The functions $1, \cos(x), \sin(x), \cos(2x), \sin(2x), \ldots, \cos(nx), \sin(nx)$ form a Chebyshev system on the interval $[0, 2\pi)$.
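The determinant condition of Definition 2.3 is easy to probe numerically. The following Python/NumPy sketch is an illustration only and is not part of the original codes; the helper name `chebyshev_determinant` is hypothetical. It evaluates the determinant (42) for the monomials of Example 2.1 and for the functions $1, \cos(x), \sin(x)$ of Example 2.3 at a few randomly chosen ordered point sets.

```python
import numpy as np

def chebyshev_determinant(funcs, points):
    """Determinant of the matrix [phi_i(x_j)] appearing in Definition 2.3."""
    matrix = np.array([[phi(x) for x in points] for phi in funcs], dtype=float)
    return np.linalg.det(matrix)

# Example 2.1: the monomials 1, x, x^2, x^3.
monomials = [lambda x, k=k: x**k for k in range(4)]

# Example 2.3 with n = 1: the functions 1, cos(x), sin(x) on [0, 2*pi).
trig = [lambda x: 1.0, np.cos, np.sin]

rng = np.random.default_rng(0)
for _ in range(3):
    pts4 = np.sort(rng.uniform(-1.0, 1.0, size=4))
    pts3 = np.sort(rng.uniform(0.0, 2.0 * np.pi, size=3))
    print(chebyshev_determinant(monomials, pts4), chebyshev_determinant(trig, pts3))
```

A sampled check of this kind can only fail to find a counterexample; it does not prove that a set of functions is a Chebyshev system.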


2.6  Quadrature  Formulae 

A quadrature rule on the interval $[-1, 1]$ is an expression of the form

$$I_N(\varphi) = \sum_{n=1}^{N} w_n \cdot \varphi(x_n), \tag{46}$$

where the points $x_n \in [-1, 1]$ and the coefficients $w_n \in \mathbb{R}$ are referred to as the nodes and the weights of the quadrature, respectively. The quadrature rule $I_N(\varphi)$ serves as an approximation to integrals of the form

$$I(\varphi) = \int_{-1}^{1} w(x) \cdot \varphi(x)\, dx, \tag{47}$$

where $\varphi : [-1, 1] \to \mathbb{R}$ is a sufficiently smooth function and $w : [-1, 1] \to \mathbb{R}$ is some fixed weight function. Since we will permit the function $w$ to be strongly singular, the integral (47) has to be evaluated in the appropriate sense. In particular, for $w(x)$ we will consider, inter alia, the singular functions

$$\frac{1}{2} \cdot \log\left((y - x)^2\right), \tag{48}$$

$$\frac{1}{y - x}, \tag{49}$$

$$\frac{1}{(y - x)^2}, \tag{50}$$

where $y \in (-1, 1)$. For the latter two functions, the integral (47) is interpreted as a principal value integral (see (7)) and finite part integral (see (10)), respectively.
Definition 2.5 A quadrature formula (46) for the integral (47) is said to be of the degree $M \ge 1$, if it integrates all polynomials up to degree $M$ exactly.

Normally, the degree of a quadrature formula (46) cannot exceed $2N - 1$ (see, for example, [20]). Quadrature rules (46) of degree $2N - 1$ are commonly referred to as Gaussian quadrature rules. The following theorem is well-known and can be found in most elementary textbooks on numerical analysis (see, for example, [20]).

Theorem 2.9 (Gaussian quadrature) Suppose that $w(x) = 1$ for all $x \in [-1, 1]$. Then there exists a unique quadrature rule (46) which has the degree $2N - 1$. Furthermore, the nodes $x_1, x_2, \ldots, x_N$ are the zeros of the $N$-th Legendre polynomial $P_N(x)$ (see (17)), and the weights $w_1, w_2, \ldots, w_N$ are all positive and given by the formula

(51)
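As an illustration of Theorem 2.9, the sketch below computes the nodes as the zeros of $P_N$ and the weights from one standard closed form, $w_n = 2 / \big((1 - x_n^2)\, P_N'(x_n)^2\big)$. This closed form is an assumption on our part: it is a well-known expression for the Gauss-Legendre weights and may be written differently from the formula the theorem refers to. The function name `gauss_legendre` is hypothetical; the code uses Python with NumPy.

```python
import numpy as np
from numpy.polynomial import legendre

def gauss_legendre(n):
    """n-point Gauss-Legendre rule on [-1, 1]."""
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0                                          # Legendre series of P_n
    nodes = np.sort(np.real(legendre.legroots(coeffs)))      # zeros of P_n
    dpn = legendre.legval(nodes, legendre.legder(coeffs))    # P_n'(x_k)
    weights = 2.0 / ((1.0 - nodes**2) * dpn**2)              # standard closed form
    return nodes, weights

nodes, weights = gauss_legendre(5)
# A 5-point rule has degree 2*5 - 1 = 9, so x^8 is integrated exactly (2/9).
print(np.dot(weights, nodes**8), 2.0 / 9.0)
```

In practice a library routine such as `numpy.polynomial.legendre.leggauss` would normally be used instead; the explicit version is shown only to mirror the statement of the theorem.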

2.7  Generalized  Gaussian  Quadrature 

Numerical quadratures are normally constructed such that the quadrature rule (46) is exactly equal to the integral (47) for some set of functions. Classical $N$-point Gaussian quadratures (see Theorem 2.9) integrate polynomials of order $2N - 1$ exactly. In [14], the notion of Gaussian quadrature was generalized as follows.

Definition 2.6 Suppose that $w : [-1, 1] \to \mathbb{R}$ is a non-negative integrable function. A quadrature rule (46) will be referred to as Gaussian with respect to a set of $2N$ functions $\varphi_1, \varphi_2, \ldots, \varphi_{2N} : [-1, 1] \to \mathbb{R}$ and a weight function $w$, if it consists of $N$ weights and nodes, and integrates the functions $w \cdot \varphi_i$ on $[-1, 1]$ exactly for all $i = 1, 2, \ldots, 2N$. The weights and the nodes of a Gaussian quadrature will be referred to as Gaussian weights and nodes, respectively.

The following theorem states that the Gaussian quadrature with respect to a set of functions $\varphi_1, \varphi_2, \ldots, \varphi_{2N}$ exists and is unique if the set $\varphi_1, \varphi_2, \ldots, \varphi_{2N}$ forms a Chebyshev system (see Definition 2.3). It is proved (in a slightly different form) in [10, 13].

Theorem 2.10 Suppose that the functions $\varphi_1, \varphi_2, \ldots, \varphi_{2N} : [-1, 1] \to \mathbb{R}$ form a Chebyshev system (see Definition 2.3) on the interval $[-1, 1]$, and that the weight function $w : [-1, 1] \to \mathbb{R}$ is non-negative and integrable. Then there exists a unique Gaussian quadrature with respect to the set $\varphi_1, \varphi_2, \ldots, \varphi_{2N}$ and the weight function $w$. Furthermore, the weights of this quadrature are all positive.

From Definition 2.6 it immediately follows that the Gaussian quadrature with respect to the functions $\varphi_1, \varphi_2, \ldots, \varphi_{2N} : [-1, 1] \to \mathbb{R}$ and the weight function $w : [-1, 1] \to \mathbb{R}$ is defined by the system of equations

$$\begin{aligned} \sum_{n=1}^{N} w_n \cdot \varphi_1(x_n) &= \int_{-1}^{1} w(x) \cdot \varphi_1(x)\, dx, \\ \sum_{n=1}^{N} w_n \cdot \varphi_2(x_n) &= \int_{-1}^{1} w(x) \cdot \varphi_2(x)\, dx, \\ &\;\;\vdots \\ \sum_{n=1}^{N} w_n \cdot \varphi_{2N}(x_n) &= \int_{-1}^{1} w(x) \cdot \varphi_{2N}(x)\, dx. \end{aligned} \tag{52}$$


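We denote the left hand sides of these equations by $f_1, f_2, \ldots, f_{2N}$; each of the $f_i$'s being a function $[-1, 1]^N \times \mathbb{R}^N \to \mathbb{R}$ of the nodes $x_1, x_2, \ldots, x_N$ and weights $w_1, w_2, \ldots, w_N$, respectively. Their partial derivatives are given by the formulae

$$\frac{\partial f_i}{\partial w_n} = \varphi_i(x_n), \tag{53}$$

$$\frac{\partial f_i}{\partial x_n} = w_n \cdot \varphi_i'(x_n), \tag{54}$$

so that the Jacobian of the system (52) takes the form

$$J(x_1, \ldots, x_N, w_1, \ldots, w_N) = \begin{pmatrix} \varphi_1(x_1) & \cdots & \varphi_1(x_N) & w_1 \varphi_1'(x_1) & \cdots & w_N \varphi_1'(x_N) \\ \vdots & & \vdots & \vdots & & \vdots \\ \varphi_{2N}(x_1) & \cdots & \varphi_{2N}(x_N) & w_1 \varphi_{2N}'(x_1) & \cdots & w_N \varphi_{2N}'(x_N) \end{pmatrix}. \tag{55}$$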

In  practice,  the  system  (52)  is  solved  via  Newton's  method  (see,  for  example,  [5]).  The 
following  theorem  states  that  when  the  functions  to  be  integrated  constitute  an  extended 
Chebyshev  system,  Newton's  method  for  this  system  is  always  quadratically  convergent 
provided  the  starting  point  for  the  iteration  is  within  a  sufficiently  small  neighborhood  of 
the solution. A proof can be found in, for example, [5].

Theorem 2.11 Suppose that the functions $\varphi_1, \ldots, \varphi_{2N}$ form an extended Chebyshev system (see Definition 2.4). Suppose further that the Gaussian quadrature nodes and weights for these functions are denoted by $x_1, x_2, \ldots, x_N$ and $w_1, w_2, \ldots, w_N$, respectively. Then the determinant of the Jacobian matrix (55) is nonzero at the point $(x_1, \ldots, x_N, w_1, \ldots, w_N)$,

$$|J(x_1, \ldots, x_N, w_1, \ldots, w_N)| \ne 0. \tag{56}$$

Furthermore, the nodes $x_1, x_2, \ldots, x_N$ and the weights $w_1, w_2, \ldots, w_N$ depend continuously on the weight function $w$.


Remark  2.12  In  order  for  Newton’s  method  to  converge,  the  starting  point  must  be  within 
a  sufficiently  small  neighborhood  of  the  solution.  In  [5]  the  continuation  method  (sometimes 
also  referred  to  as  the  homotopy  method)  is  used  to  generate  such  starting  points. 
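The Newton iteration for the system (52) can be sketched in a few lines of Python/NumPy. The code below is a minimal illustration under stated assumptions, not the scheme of [5]: it uses no continuation, takes $w(x) = 1$, and is run on the monomials $1, x, x^2, x^3$ with $N = 2$ from a starting point chosen close to the solution. The name `newton_quadrature` and its arguments are hypothetical.

```python
import numpy as np

def newton_quadrature(funcs, dfuncs, moments, x0, w0, iterations=20):
    """Newton iteration for sum_n w_n * phi_i(x_n) = moments[i], i = 1..2N."""
    x, w = np.array(x0, dtype=float), np.array(w0, dtype=float)
    n = x.size
    for _ in range(iterations):
        values = np.array([phi(x) for phi in funcs])      # phi_i(x_n), shape (2N, N)
        derivs = np.array([dphi(x) for dphi in dfuncs])   # phi_i'(x_n), shape (2N, N)
        residual = values @ w - moments                   # left minus right side of (52)
        jacobian = np.hstack([values, derivs * w])        # column layout of (55)
        step = np.linalg.solve(jacobian, -residual)
        w, x = w + step[:n], x + step[n:]                 # first N unknowns are weights
    return x, w

# Monomials 1, x, x^2, x^3 and their derivatives; weight function w(x) = 1.
funcs = [lambda x, k=k: x**k for k in range(4)]
dfuncs = [lambda x, k=k: k * x**max(k - 1, 0) for k in range(4)]
moments = np.array([2.0, 0.0, 2.0 / 3.0, 0.0])            # integrals of x^k over [-1, 1]

nodes, weights = newton_quadrature(funcs, dfuncs, moments, [-0.5, 0.5], [1.0, 1.0])
print(nodes, weights)   # should approach the 2-point Gauss-Legendre rule
```

For this small test case the iteration recovers the classical two-point rule with nodes $\pm 1/\sqrt{3}$ and unit weights; for less benign function systems a good starting point is essential, as Remark 2.12 points out.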
2.8 Singular Value Decomposition of a Set of Functions

The following theorem generalizes the standard singular value decomposition of a matrix to a set of functions. A proof can be found (in a more general form), for example, in [17].

Theorem 2.13 Suppose that the functions $\varphi_1, \varphi_2, \ldots, \varphi_N : [a, b] \to \mathbb{R}$ are square integrable. Then for some integer $M$ there exist an orthonormal set of functions $u_1, u_2, \ldots, u_M : [a, b] \to \mathbb{R}$, an $N \times M$ matrix $V = [v_{ij}]$ with orthonormal columns, and a set of real numbers $s_1 \ge s_2 \ge \ldots \ge s_M > 0$, such that

$$\varphi_j(x) = \sum_{i=1}^{M} u_i(x)\, s_i\, v_{ij} \tag{57}$$

for all $x \in [a, b]$ and all $j = 1, 2, \ldots, N$.

By analogy to the well-known singular value decomposition of matrices, we will refer to the factorization (57) as the singular value decomposition of the set of functions $\varphi_1, \varphi_2, \ldots, \varphi_N$, the functions $u_1, u_2, \ldots, u_M$ as the singular functions, the columns of the matrix $V$ as singular vectors, and the numbers $s_1 \ge s_2 \ge \ldots \ge s_M$ as the singular values, respectively.
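A discretized version of the decomposition (57) can be obtained from an ordinary matrix SVD. The sketch below is an assumption-laden illustration rather than the construction of [17]: the functions are sampled at Gauss-Legendre nodes on $[-1, 1]$, each sample is scaled by the square root of its quadrature weight so that Euclidean inner products approximate $L^2$ inner products, and singular values below a relative tolerance are discarded. The name `function_svd` and its parameters are hypothetical.

```python
import numpy as np

def function_svd(funcs, num_quad=64, tol=1e-12):
    """Discretized SVD of a set of functions on [-1, 1]."""
    t, omega = np.polynomial.legendre.leggauss(num_quad)
    samples = np.array([phi(t) for phi in funcs]).T            # samples[q, j] = phi_j(t_q)
    root = np.sqrt(omega)[:, None]
    u_scaled, s, vt = np.linalg.svd(root * samples, full_matrices=False)
    keep = s > tol * s[0]                                      # drop numerically zero s_i
    u = u_scaled[:, keep] / root                               # u[q, i] = u_i(t_q)
    return t, omega, u, s[keep], vt[keep].T                    # returned V has shape (N, M)

# Four functions, one of which (1 + x) depends linearly on the others.
funcs = [np.ones_like, lambda x: x, lambda x: x**2, lambda x: 1.0 + x]
t, omega, u, s, v = function_svd(funcs)
print(s.size)   # number of retained singular values
```

With this convention the sampled functions satisfy `samples = u @ diag(s) @ v.T`, and the columns of `u` are orthonormal with respect to the quadrature weights `omega`.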

The following theorem from [5] states that the accuracy of a quadrature formula with positive weights for the functions $\varphi_1, \varphi_2, \ldots, \varphi_N$ is determined by its accuracy for the singular functions corresponding to non-trivial singular values.

Theorem 2.14 Suppose that under the conditions of Theorem 2.13 there exist a positive real number $\varepsilon$ and an integer $1 \le M_0 \le M$, such that

$$\sum_{i=M_0+1}^{M} s_i^2 < \frac{\varepsilon^2}{4}. \tag{58}$$

Suppose further that the $L$-point quadrature rule with nodes $x_1, x_2, \ldots, x_L$ and weights $w_1, w_2, \ldots, w_L$ integrates the functions $u_i$ exactly on the interval $[a, b]$, i.e.

$$\sum_{j=1}^{L} w_j \cdot u_i(x_j) = \int_a^b u_i(x)\, dx \tag{59}$$

for all $i = 1, 2, \ldots, M_0$, and that the weights $w_1, w_2, \ldots, w_L$ are all positive. Then for each $i = 1, 2, \ldots, N$,

$$\left| \sum_{j=1}^{L} w_j \cdot \varphi_i(x_j) - \int_a^b \varphi_i(x)\, dx \right| < \varepsilon \cdot \|\varphi_i\|_2. \tag{60}$$
3  Analytical  Apparatus 

The principal purpose of this paper is to construct quadrature formulae for functions $f : [-1, 1] \to \mathbb{R}$ of the form

$$f(x) = \varphi(x) + \psi(x) \cdot \log|x| + \frac{\eta(x)}{x} + \frac{\theta(x)}{x^2}, \tag{61}$$

where $\varphi, \psi, \eta, \theta : [-1, 1] \to \mathbb{R}$ are smooth. In Section 3.1, we construct separate quadrature formulae for each of the functions of the form

$$\varphi(x), \quad \psi(x) \cdot \log|x|, \quad \frac{\eta(x)}{x}, \quad \frac{\theta(x)}{x^2}; \tag{62}$$

in Section 3.2, we present a scheme where each quadrature it produces can be used simultaneously for the efficient numerical integration of functions of the form (61).

Obviously, integrals of the form

$$\int_{-1}^{1} \left( \varphi(x) + \psi(x) \cdot \log(|y - x|) + \frac{\eta(x)}{y - x} + \frac{\theta(x)}{(y - x)^2} \right) dx \tag{63}$$

with $y$ outside the interval of integration $[-1, 1]$ and the functions $\varphi, \psi, \eta, \theta$ smooth, can be evaluated with standard Gaussian quadrature formulae. However, when $y$ is sufficiently close to the interval of integration $[-1, 1]$, the number of Gaussian nodes needed to achieve acceptable accuracy is often very high. Therefore, more specialized quadratures are desirable in this case; Section 3.3 is devoted to the design of generalized Gaussian quadratures for such integrals.

3.1 Quadrature Formulae for Individual Singularities $\log|x|$, $\frac{1}{x}$, $\frac{1}{x^2}$

The following theorem is one of the principal analytical tools used in this paper.

Theorem 3.1 Suppose that $x_1, x_2, \ldots, x_N$ and $w_1, w_2, \ldots, w_N$ denote the $N$ nodes and weights of the Gaussian quadrature on the interval $[-1, 1]$, respectively (see Theorem 2.9). Suppose further that $P_j(x)$ denotes the $j$-th Legendre polynomial (see (17)), and that $w(x) \cdot P_j(x)$ is integrable on $[-1, 1]$ for all $j = 0, 1, \ldots, N - 1$. Then the quadrature rule

$$\int_{-1}^{1} w(x) \cdot \varphi(x)\, dx \approx \sum_{n=1}^{N} \tilde{w}_n \cdot \varphi(x_n) \tag{64}$$

with the weights $\tilde{w}_n$ defined by the formula

$$\tilde{w}_n = w_n \cdot \left( \sum_{j=0}^{N-1} \frac{2j + 1}{2} \cdot P_j(x_n) \cdot \left( \int_{-1}^{1} w(x)\, P_j(x)\, dx \right) \right) \tag{65}$$

has the degree $N - 1$.

Proof. Suppose that $\varphi : [-1, 1] \to \mathbb{R}$ is a polynomial of order $N - 1$ given by its Legendre series (21) so that

$$\varphi(x) = \sum_{j=0}^{N-1} a_j P_j(x). \tag{66}$$

Substituting (66) into (47), we obtain

$$I(\varphi) = \int_{-1}^{1} w(x) \cdot \varphi(x)\, dx = \int_{-1}^{1} w(x) \cdot \left( \sum_{j=0}^{N-1} a_j P_j(x) \right) dx = \sum_{j=0}^{N-1} a_j \cdot \left( \int_{-1}^{1} w(x)\, P_j(x)\, dx \right). \tag{67}$$

The coefficients $a_j$ are given by (20). Evaluating the integral (20) via $N$-point Gaussian quadrature (see Theorem 2.9), we obtain the identity

$$a_j = \frac{2j + 1}{2} \int_{-1}^{1} \varphi(x)\, P_j(x)\, dx = \frac{2j + 1}{2} \sum_{n=1}^{N} w_n \cdot \varphi(x_n) \cdot P_j(x_n), \tag{68}$$

for all $j = 0, 1, \ldots, N - 1$. Finally, substituting (68) into (67) we obtain

$$\int_{-1}^{1} w(x) \cdot \varphi(x)\, dx = \sum_{n=1}^{N} w_n \cdot \varphi(x_n) \cdot \left( \sum_{j=0}^{N-1} \frac{2j + 1}{2} \cdot P_j(x_n) \cdot \left( \int_{-1}^{1} w(x)\, P_j(x)\, dx \right) \right), \tag{69}$$

from which (64) and (65) immediately follow. □
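The weight formula (65) is straightforward to implement once the moments $\int_{-1}^{1} w(x)\, P_j(x)\, dx$ are available. The Python/NumPy sketch below is illustrative only and makes a simplifying assumption: it takes $w(x) = \log|x|$ (singularity at the origin), for which the monomial moments have the elementary closed form $\int_{-1}^{1} \log|x|\, x^k\, dx = -2/(k+1)^2$ for even $k$ and $0$ for odd $k$, and it converts them to Legendre moments with `leg2poly`. The function name `log_weight_rule` is hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre

def log_weight_rule(n):
    """Nodes and weights (65) for the weight function w(x) = log|x| on [-1, 1]."""
    x, w = legendre.leggauss(n)
    k = np.arange(n)
    # Monomial moments of log|x|: -2/(k+1)^2 for even k, 0 for odd k.
    mono = np.where(k % 2 == 0, -2.0 / (k + 1.0) ** 2, 0.0)
    factor = np.zeros(n)
    for j in range(n):
        e_j = np.zeros(j + 1)
        e_j[j] = 1.0                                  # Legendre series of P_j
        moment = legendre.leg2poly(e_j) @ mono[: j + 1]   # int log|x| P_j(x) dx
        factor += 0.5 * (2 * j + 1) * legendre.legval(x, e_j) * moment
    return x, w * factor

x, w_tilde = log_weight_rule(6)
phi = 3.0 * x**4 - x + 2.0
# Exact value: 3 * (-2/25) - 0 + 2 * (-2) = -4.24
print(np.dot(w_tilde, phi))
```

The conversion through the monomial basis is acceptable for small $N$ only; for larger $N$ the Legendre moments should be obtained directly, for example through recurrences.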

Remark 3.2 If the function $\varphi$ is $k$ times continuously differentiable, it immediately follows from the Cauchy-Schwarz inequality and (23) in Lemma 2.3 that

$$\left| \int_{-1}^{1} w(x) \cdot \varphi(x)\, dx - \sum_{n=1}^{N} \tilde{w}_n \cdot \varphi(x_n) \right| \;\ldots \tag{70}$$

The following theorem extends Theorem 3.1 to the case when the function $w : [-1, 1] \to \mathbb{R}$ is defined by one of the formulae (48) - (50). The latter two functions are not integrable in the classical sense, and the integral (47) is interpreted as a principal value integral (see (7)) and finite part integral (see (10)), respectively. The theorem follows immediately from the combination of Theorems 2.4, 2.7, 3.1.

Theorem 3.3 Suppose that $x_1, x_2, \ldots, x_N$ and $w_1, w_2, \ldots, w_N$ denote the $N$ nodes and weights of the Gaussian quadrature on the interval $[-1, 1]$ (see Theorem 2.9). Suppose further that $\varphi : [-1, 1] \to \mathbb{R}$ is a sufficiently smooth function, and $P_j(x)$, $Q_j(x)$ denote the $j$-th Legendre polynomial and Legendre function of the second kind (see (17), (37)), respectively. Finally, suppose that the coefficients $w_{1,1}, w_{1,2}, \ldots, w_{1,N}$, $w_{2,1}, w_{2,2}, \ldots, w_{2,N}$, $w_{3,1}, w_{3,2}, \ldots, w_{3,N}$ are defined by the formulae

$$w_{1,n} = w_n \cdot \sum_{j=0}^{N-1} (2j + 1) \cdot P_j(x_n) \cdot Q_j(y), \tag{71}$$

$$w_{2,n} = w_n \cdot \Big( (P_0(x_n) - P_1(x_n)) \cdot R_0(y) + \sum_{j=1}^{N-2} (P_{j-1}(x_n) - P_{j+1}(x_n)) \cdot R_j(y) + P_{N-2}(x_n) \cdot R_{N-1}(y) + P_{N-1}(x_n) \cdot R_N(y) \Big), \tag{72}$$

$$w_{3,n} = \ldots \tag{73}$$

for all $n = 1, 2, \ldots, N$, with $[\,\cdot\,]$ denoting the integer part of its argument, and the mappings $R_j : (-1, 1) \to \mathbb{R}$ defined by the formula

$$R_j(y) = Q_j(y) + \frac{1}{4} \cdot \log\left((y - 1)^2\right). \tag{74}$$

Then, for any point $y \in (-1, 1)$, the quadrature rules

$$\text{p.v.} \int_{-1}^{1} \frac{\varphi(x)}{y - x}\, dx \approx \sum_{n=1}^{N} w_{1,n} \cdot \varphi(x_n), \tag{75}$$

$$\int_{-1}^{1} \frac{1}{2} \cdot \log\left((y - x)^2\right) \cdot \varphi(x)\, dx \approx \sum_{n=1}^{N} w_{2,n} \cdot \varphi(x_n), \tag{76}$$

$$\text{f.p.} \int_{-1}^{1} \frac{\varphi(x)}{(y - x)^2}\, dx \approx \sum_{n=1}^{N} w_{3,n} \cdot \varphi(x_n), \tag{77}$$

have the degree $N - 1$, $N - 2$, and $N - 1$, respectively.

3.2 Quadrature Formulae for Functions of the Form $\varphi(x) + \psi(x) \cdot \log|x| + \frac{\eta(x)}{x} + \frac{\theta(x)}{x^2}$

Theorem 3.3 provides a tool for the numerical integration of functions of the form

$$\psi(x) \cdot \log|x|, \tag{78}$$

$$\frac{\eta(x)}{x}, \tag{79}$$

$$\frac{\theta(x)}{x^2}. \tag{80}$$

However, integrands are frequently encountered of the form

$$f(x) = \varphi(x) + \psi(x) \cdot \log|x| + \frac{\eta(x)}{x} + \frac{\theta(x)}{x^2}, \tag{81}$$

where the functions $\varphi, \psi, \eta, \theta$ are known to be smooth but are not available individually.
Specifically, in the numerical solution of scattering problems, one is frequently confronted with the need to evaluate integrals of the form

$$\text{f.p.} \int_{-1}^{1} K(x, y) \cdot \sigma(x)\, dx = \text{f.p.} \int_{-1}^{1} \left( K_1(x, y) + K_2(x, y) \cdot \log(|x - y|) + \frac{K_3(x, y)}{y - x} + \frac{K_4(x, y)}{(y - x)^2} \right) \cdot \sigma(x)\, dx, \tag{82}$$

where $\sigma : [-1, 1] \to \mathbb{R}$ and $K_1(x, y), K_2(x, y), K_3(x, y), K_4(x, y) : [-1, 1] \times [-1, 1] \to \mathbb{R}$ are smooth functions, and $y \in (-1, 1)$. Normally, the functions $K_1, K_2, K_3, K_4$ are not available separately, so that only the kernel $K$ in toto can be evaluated. In such cases, a single quadrature rule integrating functions of the composite form (81) is clearly preferable. Even when each of the functions $\varphi, \psi, \eta, \theta$ is available separately, the numerical implementation is simplified when a single quadrature formula can be used.

Given a real number $y \in (-1, 1)$, we denote by $\psi_1, \psi_2, \ldots, \psi_{4M}$ the functions $[-1, 1] \to \mathbb{R}$ defined by the formulae

$$\psi_i(x) = \begin{cases} P_{i-1}(x), & \text{for } i = 1, \ldots, M, \\ P_{i-M-1}(x) \cdot \log(|y - x|), & \text{for } i = M + 1, \ldots, 2M, \\ P_{i-2M-1}(x) \cdot \dfrac{1}{y - x}, & \text{for } i = 2M + 1, \ldots, 3M, \\ P_{i-3M-1}(x) \cdot \dfrac{1}{(y - x)^2}, & \text{for } i = 3M + 1, \ldots, 4M. \end{cases} \tag{83}$$

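In a minor generalization of the standard terminology, we define the generalized moments $m_1(y), m_2(y), \ldots, m_{4M}(y)$ by the formulae

$$m_i(y) = \begin{cases} \displaystyle\int_{-1}^{1} P_{i-1}(x)\, dx, & \text{for } i = 1, \ldots, M, \\ \displaystyle\int_{-1}^{1} P_{i-M-1}(x) \cdot \log(|y - x|)\, dx, & \text{for } i = M + 1, \ldots, 2M, \\ \displaystyle\text{p.v.} \int_{-1}^{1} \frac{P_{i-2M-1}(x)}{y - x}\, dx, & \text{for } i = 2M + 1, \ldots, 3M, \\ \displaystyle\text{f.p.} \int_{-1}^{1} \frac{P_{i-3M-1}(x)}{(y - x)^2}\, dx, & \text{for } i = 3M + 1, \ldots, 4M. \end{cases} \tag{84}$$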

Now, suppose that $x_1, x_2, \ldots, x_N$ denotes the $N$ Legendre nodes on $[-1, 1]$ (see (19)). Then we define the weights $\tilde{w}_1, \tilde{w}_2, \ldots, \tilde{w}_N$ of the quadrature formula

$$\int_{-1}^{1} f(x)\, dx \approx \sum_{n=1}^{N} \tilde{w}_n \cdot f(x_n) \tag{85}$$

as the solution of the system of the $4M$ linear algebraic equations

$$\begin{aligned} \sum_{n=1}^{N} \tilde{w}_n \cdot \psi_1(x_n) &= m_1(y), \\ \sum_{n=1}^{N} \tilde{w}_n \cdot \psi_2(x_n) &= m_2(y), \\ &\;\;\vdots \\ \sum_{n=1}^{N} \tilde{w}_n \cdot \psi_{4M}(x_n) &= m_{4M}(y). \end{aligned} \tag{86}$$

Obviously, the matrix of the system (86) might be square, or it might be over- or under-determined, depending on the values of the parameters $M$, $N$. On the other hand, given a solution $\tilde{w}_1, \tilde{w}_2, \ldots, \tilde{w}_N$ of (86), we can be sure that the quadrature formula (85) will integrate exactly all functions $f$ of the form (81), as long as the functions $\varphi, \psi, \eta, \theta$ are polynomials of order not greater than $M - 1$. Due to Theorem A.6 in Appendix A below, for sufficiently large $N$, there always exist multiple solutions of (86), and a solution $\tilde{w}_1, \tilde{w}_2, \ldots, \tilde{w}_N$ can be found such that

$$\sum_{n=1}^{N} \tilde{w}_n^2 \le C \cdot \sum_{n=1}^{N} w_n^2, \tag{87}$$

where $w_1, w_2, \ldots, w_N$ are the weights of the $N$-point Gaussian quadrature and $C$ is a positive real constant. In practice, least squares are used to find $\tilde{w}_1, \tilde{w}_2, \ldots, \tilde{w}_N$ satisfying the bound (87) (see Section 4 below). Denoting the $N \times 4M$ matrix of system (86) by $A$ and its right-hand side by $b$, we rewrite (86) in the form

$$A \tilde{w} = b. \tag{88}$$


3.3 Generalized Gaussian Quadrature Formulae for Functions of the Form $\varphi(x) + \psi(x) \cdot \log|x| + \frac{\eta(x)}{x} + \frac{\theta(x)}{x^2}$

In Section 3.2 we described the quadrature formula (85) for integrals of the form (82) where the point of evaluation $y$ is inside the interval of integration. While standard numerical quadratures (e.g. Newton-Cotes or Gaussian quadratures) can be used for integrals of the form (82) when the point of evaluation $y$ is outside and sufficiently far away from the interval of integration, more specialized quadratures are desirable when $y$ is outside but close to the interval of integration.

Given two positive real numbers $d$ and $R$ such that $d < R$, we will denote by $D_{R,d}$ the set $[-R, -1 - d] \cup [1 + d, R]$ (see Figure 1). We define the functions $\psi_1, \psi_2, \ldots, \psi_{4M} : [-1, 1] \times D_{R,d} \to \mathbb{R}$ by the formulae

$$\psi_i(x, y) = \begin{cases} P_{i-1}(x), & \text{for } i = 1, \ldots, M, \\ P_{i-M-1}(x) \cdot \log(|y - x|), & \text{for } i = M + 1, \ldots, 2M, \\ P_{i-2M-1}(x) \cdot \dfrac{1}{y - x}, & \text{for } i = 2M + 1, \ldots, 3M, \\ P_{i-3M-1}(x) \cdot \dfrac{1}{(y - x)^2}, & \text{for } i = 3M + 1, \ldots, 4M, \end{cases} \tag{89}$$
where $P_j$ denotes the $j$-th Legendre polynomial (17).

Now, suppose that $y_1, y_2, \ldots, y_K$ are points in $D_{R,d}$. We will denote by $\eta_{i,j} : [-1, 1] \to \mathbb{R}$ the $4 \cdot K \cdot M$ functions defined by the formula

$$\eta_{i,j}(x) = \psi_i(x, y_j) \tag{90}$$

where $i = 1, 2, \ldots, 4M$ and $j = 1, 2, \ldots, K$. Since it will be convenient to view the functions $\eta_{i,j}$ as a finite sequence of functions $[-1, 1] \to \mathbb{R}$, we introduce the notation

$$k = 4(j - 1)M + i, \tag{91}$$

so that

$$i = k - 4(j - 1)M, \tag{92}$$

$$j = \frac{k - i}{4M} + 1. \tag{93}$$

In a mild abuse of notation, we will use $\eta_k$ and $\eta_{i,j}$ interchangeably.

Due to Theorem 2.13, there exist orthonormal functions $u_1, u_2, \ldots, u_L : [-1, 1] \to \mathbb{R}$, a matrix $V \in \mathbb{R}^{4 \cdot K \cdot M \times L}$ with orthonormal columns, and real numbers $s_1 \ge s_2 \ge \ldots \ge s_L > 0$, for some integer $L \le 4 \cdot K \cdot M$, such that

$$\eta_k(x) = \sum_{i=1}^{L} u_i(x)\, s_i\, v_{ik} \tag{94}$$

for all $k = 1, 2, \ldots, 4 \cdot K \cdot M$.

Remark 3.4 For an arbitrary positive real number $\varepsilon$, we will denote by $n(\varepsilon)$ the number of coefficients $s_i$ in the decomposition (94) such that $s_i > \varepsilon$. It turns out that for fixed $d$ and $R$, $n(\varepsilon)$ is proportional to $\log(1/\varepsilon)$, and is virtually independent of $K$. For a fixed $\varepsilon$, $n(\varepsilon)$ is proportional to log(f), and is virtually independent of $K$. The behavior of $n(\varepsilon)$ as a function of $\varepsilon$, $d$, $R$ is investigated in detail in [22].
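The quantity $n(\varepsilon)$ of Remark 3.4 can be explored with a discretized singular value decomposition. The Python/NumPy sketch below is only a rough numerical experiment under several assumptions that are ours, not the paper's: the functions (89) are sampled on a fixed Gauss-Legendre grid (so the count is capped by the grid size), the points $y_j$ are taken equispaced in $D_{R,d}$, and the default parameters $M = 3$, $K = 40$, $d = 0.5$, $R = 3$ are arbitrary. The name `count_singular_values` is hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre

def count_singular_values(eps, m=3, k=40, d=0.5, r=3.0, num_quad=200):
    """Number of discretized singular values of the family (90) exceeding eps."""
    t, omega = legendre.leggauss(num_quad)
    half = np.linspace(1.0 + d, r, k // 2)
    ys = np.concatenate([-half[::-1], half])               # K points in D_{R,d}
    p = [legendre.legval(t, np.eye(m)[j]) for j in range(m)]   # P_0, ..., P_{m-1}
    cols = []
    for y in ys:
        kernels = (np.ones_like(t), np.log(np.abs(y - t)),
                   1.0 / (y - t), 1.0 / (y - t) ** 2)
        for kernel in kernels:
            for j in range(m):
                cols.append(p[j] * kernel)                 # one function eta_k sampled on t
    a = np.sqrt(omega)[:, None] * np.array(cols).T         # weighted sample matrix
    s = np.linalg.svd(a, compute_uv=False)
    return int(np.sum(s > eps))

for eps in (1e-3, 1e-6, 1e-9):
    print(eps, count_singular_values(eps))
```

The ordering of the functions differs from (91), which does not affect the singular values. For $y$ very close to the interval the fixed grid would no longer resolve the integrands, so the experiment is meaningful only for moderate $d$.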


The  following  theorem  is  an  immediate  consequence  of  Theorems  2.11,  2.14. 


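Theorem 3.5 Suppose that for a sufficiently large integer number $K$, $y_1, y_2, \ldots, y_K$ are points in $D_{R,d}$ such that $y_i \ne y_j$ for all $i \ne j$. Suppose further that the functions $\eta_1, \eta_2, \ldots, \eta_{4KM} : [-1, 1] \to \mathbb{R}$, the real positive numbers $s_1, s_2, \ldots, s_L$, and the functions $u_1, u_2, \ldots, u_L : [-1, 1] \to \mathbb{R}$ are defined by the formulae (90), (94), respectively. Given a positive real number $\varepsilon$, we denote by $L_0$ the smallest even integer such that $1 \le L_0 \le L$ and

$$\sum_{i=L_0+1}^{L} s_i^2 < \frac{\varepsilon^2}{4}. \tag{95}$$

Then there exists a unique solution $(w_1, \ldots, w_{L_0/2}, x_1, \ldots, x_{L_0/2})$ of the non-linear system

$$\begin{aligned} \sum_{n=1}^{L_0/2} w_n \cdot u_1(x_n) &= \int_{-1}^{1} u_1(x)\, dx, \\ \sum_{n=1}^{L_0/2} w_n \cdot u_2(x_n) &= \int_{-1}^{1} u_2(x)\, dx, \\ &\;\;\vdots \\ \sum_{n=1}^{L_0/2} w_n \cdot u_{L_0}(x_n) &= \int_{-1}^{1} u_{L_0}(x)\, dx, \end{aligned} \tag{96}$$

where all $w_n$, $n = 1, 2, \ldots, L_0/2$, are positive. Furthermore, for each $k = 1, 2, \ldots, 4 \cdot K \cdot M$, the $\frac{L_0}{2}$-point quadrature rule

$$\sum_{n=1}^{L_0/2} w_n \cdot \eta_k(x_n) \approx \int_{-1}^{1} \eta_k(x)\, dx \tag{97}$$

has relative accuracy $\varepsilon$; that is

$$\left| \sum_{n=1}^{L_0/2} w_n \cdot \eta_k(x_n) - \int_{-1}^{1} \eta_k(x)\, dx \right| < \varepsilon \cdot \|\eta_k\|_2. \tag{98}$$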

Remark 3.6 The solution of the system of non-linear equations (96) can be found by Newton's method. For a detailed discussion of a Newton method for non-linear systems arising in the construction of generalized Gaussian quadratures, the reader is referred to [5].


4  Numerical  Algorithm 

In Sections 3.1, 3.2 we have described quadrature rules for integrands of the form (78) - (81). While the numerical evaluation of the weights of the quadratures (75) - (77) in Section 3.1 via the formulae (72) - (73) is straightforward, the evaluation of the weights $\tilde{w}_1, \tilde{w}_2, \ldots, \tilde{w}_N$ of the quadrature (85) is more involved; we summarize the computational procedure below.

The input to the algorithm is a real number $y \in (-1, 1)$, a natural number $N$ where $N$ is the number of Legendre nodes (19) on the interval $[-1, 1]$, and a natural number $M$ where $M - 1$ is the degree of the quadrature rule. The algorithm will then compute quadrature weights $\tilde{w}_1, \tilde{w}_2, \ldots, \tilde{w}_N$, such that

$$\int_{-1}^{1} w(x) \cdot \varphi(x)\, dx \approx \sum_{n=1}^{N} \tilde{w}_n \cdot \varphi(x_n), \tag{99}$$

where $\varphi : [-1, 1] \to \mathbb{R}$ is smooth and $w : [-1, 1] \to \mathbb{R}$ is a linear combination of smooth functions and functions of the form (48) - (50). It consists of the following steps:

1. Construct the $N$-point Gaussian nodes $x_1, x_2, \ldots, x_N$ and weights $w_1, w_2, \ldots, w_N$ on the interval $[-1, 1]$ (see Theorem 2.9).

2. Evaluate the Legendre polynomials $P_0, P_1, \ldots, P_{M-1}$ at the nodes $x_1, x_2, \ldots, x_N$ via the three-term recursion (14).

3. Evaluate all the functions $\psi_1, \psi_2, \ldots, \psi_{4M}$ (see (83)) at the nodes $x_1, x_2, \ldots, x_N$.

4. Construct the moments $m_1(y), m_2(y), \ldots, m_{4M}(y)$ (see (84)) exactly, using Gaussian quadrature for $m_1, m_2, \ldots, m_M$ and quadrature rules (75) - (77) for $m_{M+1}(y), m_{M+2}(y), \ldots, m_{4M}(y)$, respectively.

5. Solve the linear algebraic system (88) in the least squares sense with any standard routine (available, for example, in LAPACK [2]).
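The steps above can be sketched compactly in Python with NumPy. The version below is a deliberately simplified illustration and not the FORTRAN implementation: it fixes the singular point at $y = 0$, uses monomials in place of Legendre polynomials, and replaces step 4 by elementary closed-form moments (with $\text{p.v.}\int_{-1}^{1} dx/x = 0$ and $\text{f.p.}\int_{-1}^{1} dx/x^2 = -2$) instead of the rules (75) - (77). The function names and the cutoff `rcond=1e-12` are assumptions of this sketch. The number of nodes must be even so that no Legendre node coincides with the singularity.

```python
import numpy as np

def monomial_integral(p):
    """Integral of x^p over [-1, 1] for an integer p >= 0."""
    return 2.0 / (p + 1) if p % 2 == 0 else 0.0

def combined_rule_at_zero(m, n):
    """Least-squares weights for f = phi + psi*log|x| + eta/x + theta/x^2 (y = 0)."""
    x, _ = np.polynomial.legendre.leggauss(n)      # Legendre nodes (n even)
    rows, rhs = [], []
    for k in range(m):
        rows.append(x**k)                          # smooth part
        rhs.append(monomial_integral(k))
        rows.append(x**k * np.log(np.abs(x)))      # logarithmic part
        rhs.append(-2.0 / (k + 1) ** 2 if k % 2 == 0 else 0.0)
        rows.append(x**k / x)                      # principal value part
        rhs.append(0.0 if k == 0 else monomial_integral(k - 1))
        rows.append(x**k / x**2)                   # finite part
        rhs.append(-2.0 if k == 0 else (0.0 if k == 1 else monomial_integral(k - 2)))
    a, b = np.array(rows), np.array(rhs)
    w, *_ = np.linalg.lstsq(a, b, rcond=1e-12)     # minimum-norm least-squares solution
    return x, w

x, w = combined_rule_at_zero(4, 24)                # N = 6M, as in Example 5.2
f = (1 + x**2) + x * np.log(np.abs(x)) + (1 + x) / x + (2 + x**2) / x**2
print(np.dot(w, f))                                # exact finite-part value is 8/3
```

Because the system is underdetermined for $N = 6M$, `lstsq` returns the solution of minimum Euclidean norm, which is in the spirit of the bound (87); whether it satisfies (87) with a useful constant is not checked here.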

5  Numerical  Examples 

FORTRAN  codes  have  been  written  constructing  the  quadratures  described  in  Sections  3.1, 
3.2,  3.3;  in  this  section,  their  performance  is  illustrated  with  several  numerical  examples. 
In  all  examples  below  the  quadrature  nodes  and  weights  are  first  computed  in  extended 
precision  arithmetic  (REAL  *16)  to  assure  full  double  precision  accuracy.  The  quadrature 
rules  are  then  used  in  double  precision  (REAL  *8)  to  numerically  integrate  a  number  of 
functions with singularities $\log|x|$, $\frac{1}{x}$, $\frac{1}{x^2}$.

Example 5.1 In the first example, we use the quadrature rules (75) - (77) to evaluate integrals of the form (47) for each of the singularities (48) - (50) with the function $\varphi : [-1, 1] \to \mathbb{R}$ defined by the formula

$$\varphi(x) = \sin(2x) + \cos(3x), \tag{100}$$

so that the actual functions to be integrated are of the form

$$\log(|x - y|) \cdot (\sin(2x) + \cos(3x)), \tag{101}$$

$$\frac{1}{y - x} \cdot (\sin(2x) + \cos(3x)), \tag{102}$$

$$\frac{1}{(y - x)^2} \cdot (\sin(2x) + \cos(3x)). \tag{103}$$

We denote by $y_1, y_2, \ldots, y_{14}$ the 14 Legendre nodes on the interval $[-1, 1]$ (see (19)). The integrals of (101) - (103) were evaluated at $y_1, y_2, \ldots, y_{14}$, and the relative errors in the $l^2$ norm were obtained via the formula

$$E_{rel} = \frac{\sqrt{\sum_i E_{abs}(y_i)^2}}{\sqrt{\sum_i I(\varphi)(y_i)^2}}, \tag{104}$$

where $E_{abs}(y_i)$ and $I(\varphi)(y_i)$ denote the absolute error and the exact integral (47) evaluated at the point $y_i$, respectively. The integrals $I(\varphi)(y_i)$ were computed analytically using MATHEMATICA.
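For reference, the relative $l^2$ error used throughout this section amounts to the following short Python/NumPy helper. It is a sketch of the error measure only; the function name `relative_l2_error` is hypothetical, and the reference values are assumed to be supplied by the caller.

```python
import numpy as np

def relative_l2_error(approx, exact):
    """Relative l2 error: ||approx - exact||_2 / ||exact||_2 over the points y_i."""
    approx = np.asarray(approx, dtype=float)
    exact = np.asarray(exact, dtype=float)
    return np.sqrt(np.sum((approx - exact) ** 2)) / np.sqrt(np.sum(exact ** 2))

print(relative_l2_error([1.1, 2.0], [1.0, 2.0]))
```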

In Figure 2, the relative errors of the integrals of (101) - (103) are presented for $N = 6, 8, \ldots, 26$. For comparison, the relative errors of the $N$-point Gaussian rules (see Theorem 2.9) with $N = 6, 8, \ldots, 26$ applied to the function (100) are shown as well.


Remark 5.1 The weights (see (72) - (73)) of the quadrature rules (75) - (77) used in Example 5.1 above depend upon the point of evaluation $y$. Therefore, for the evaluation of each of the integrals (101) - (103) at each of the points $y_1, y_2, \ldots, y_{14}$, a different set of quadrature weights is used. As an example, in Table 1 we list the quadrature nodes $x_n$ and weights $w_{1,n}$, $w_{2,n}$, $w_{3,n}$ of the 14-node version of the quadratures (75) - (77) for the integration of functions with singularities $\log(|x - y_1|)$, $\frac{1}{y_1 - x}$, $\frac{1}{(y_1 - x)^2}$ with $y_1 = -0.9862838086968123$ (the smallest of the 14 Legendre nodes on $[-1, 1]$).

Example 5.2 In this example, we compute the same integrals as in Example 5.1. However, this time we use the quadrature rule (85) that integrates functions of the combined form (81). Specifically, the quadrature weights were constructed via the numerical algorithm described in Section 4 for integrands of the form

$$\sum_{i=1}^{M} \left( a_i + b_i \cdot \log(|y_k - x|) + \frac{c_i}{y_k - x} + \frac{d_i}{(y_k - x)^2} \right) \cdot P_{i-1}(x), \tag{105}$$

for each Legendre node $y_k$, $k = 1, 2, \ldots, 14$, on the interval $[-1, 1]$ (see (19)). In our computations, we chose the number of weights $N$ equal to $6M$.

In Figure 3 the relative errors (see (104)) are presented for $N = 36, 48, \ldots, 144$.

Example 5.3 In this example, we use the generalized Gaussian quadrature described in Section 3.3 to integrate the functions (101) - (103) where $y$ is a point outside but close to the interval $[-1, 1]$. Specifically, 36 and 42-node versions of the quadrature formula (97) were constructed for integrands of the form

$$\sum_{i=1}^{M} \left( a_i + b_i \cdot \log(|y - x|) + \frac{c_i}{y - x} + \frac{d_i}{(y - x)^2} \right) \cdot P_{i-1}(x), \tag{106}$$

where $y \in [-10, -1.0016] \cup [1.0016, 10]$. The 36 and 42-node versions were constructed with $M = 11$ and $M = 21$, respectively. In order to test the accuracy of the resulting quadratures, the integrals (101) - (103) were evaluated at 202 equispaced points $y_1, y_2, \ldots, y_{202} \in [-2.002, -1.002] \cup [1.002, 2.002]$, defined by the formula

$$y_k = \begin{cases} -2.002 + 0.01 \cdot (k - 1), & \text{for } k = 1, \ldots, 101, \\ 1.002 + 0.01 \cdot (k - 102), & \text{for } k = 102, \ldots, 202. \end{cases} \tag{107}$$

In Table 2, the relative errors (see (104)) of the $N$-point generalized Gaussian quadratures with $N = 36, 42$ applied to the functions (100), (101) - (103) are presented. For comparison, the relative errors of the $N$-point Gaussian rules (see Theorem 2.9) with $N = 36, 42, 100, 150, \ldots, 300$ applied to the same functions are shown in Table 3. In Tables 4, 5 we list the quadrature nodes $x_n$ and weights $w_n$ of the 36 and 42-node versions of the quadrature (97).


Example 5.4 In this example, we use a compound quadrature formula based on the combination of the singular quadrature (85), generalized Gaussian quadrature (97), and Gaussian quadrature (see Theorem 2.9) to evaluate the integral

$$F(y) = \text{f.p.} \int_{-1}^{1} \left( 1 + \log(|y - x|) + \frac{1}{y - x} + \frac{1}{(y - x)^2} \right) \cdot (\sin(200x) + \cos(300x))\, dx \tag{108}$$

at several points $y \in (-1, 1)$. Specifically, we subdivide the interval of integration $[-1, 1]$ into $K$ subintervals $I_1, \ldots, I_K$ where

$$I_i = \left[ -1 + \frac{2}{K} \cdot (i - 1),\; -1 + \frac{2}{K} \cdot i \right] \tag{109}$$

for all $i = 1, 2, \ldots, K$, and then apply a specific quadrature rule on each subinterval to evaluate (108). The quadrature rule used on subinterval $I_i$ is determined by one of the following criteria:

• if $y \in I_i$, then the combined singular quadrature rule (85) is used;

• if $y \notin I_i$ and $y \in I_{i-1} \cup I_{i+1}$, then generalized Gaussian quadrature (97) is used;

• if $y \notin I_i$ and $y \notin I_{i-1} \cup I_{i+1}$, then Gaussian quadrature (see Theorem 2.9) is used.
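The selection criteria above can be sketched as a small Python function. This is an illustration under the assumption that the subintervals are the $K$ equal pieces of $[-1, 1]$ given in (109); the names `subinterval` and `choose_rule` and the returned labels are hypothetical, and the quadrature rules themselves are not implemented here.

```python
def subinterval(i, k):
    """End points of I_i = [-1 + 2(i-1)/K, -1 + 2i/K] for i = 1, ..., K."""
    return -1.0 + 2.0 * (i - 1) / k, -1.0 + 2.0 * i / k

def choose_rule(y, i, k):
    """Label of the quadrature rule used on subinterval I_i for the target point y."""
    def contains(idx):
        if idx < 1 or idx > k:          # no neighbor beyond the ends of [-1, 1]
            return False
        lo, hi = subinterval(idx, k)
        return lo <= y <= hi
    if contains(i):
        return "singular (85)"
    if contains(i - 1) or contains(i + 1):
        return "generalized Gaussian (97)"
    return "Gaussian (Theorem 2.9)"

for i in range(1, 5):
    print(i, choose_rule(0.1, i, 4))
```

Since the target points used below are Legendre nodes in the interior of a subinterval, the treatment of points lying exactly on a shared end point does not matter in practice.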

We denote by

$$y_1^i, y_2^i, \ldots, y_M^i \tag{110}$$

the $M$ Legendre nodes (see (19)) on subinterval $I_i$. Furthermore, we denote by $y_1, y_2, \ldots, y_{MK}$ the set of all points (110) from all subintervals $I_i$, $i = 1, 2, \ldots, K$. In other words,

$$y_j^i = y_{M(i-1)+j}, \tag{111}$$

where $i = 1, 2, \ldots, K$ and $j = 1, \ldots, M$. Obviously, by evaluating the integral (108) at the points $y_1, y_2, \ldots, y_{MK}$ via the procedure described above, we obtain approximations to $F(y_1), F(y_2), \ldots, F(y_{MK})$. We perform the calculations with $M = 4, 6, 10, 12, 16$ and $K = 2, 4, 8, \ldots, 8192$; and in order to compare the accuracy for two different choices of $K$, we interpolate the obtained values with an $M$-th order interpolation scheme to the 100 equispaced points $t_1, t_2, \ldots, t_{100}$ in the interval $(-1, 1)$ defined by the formula

(112)

for all $i = 1, \ldots, 100$.

In Table 6, the relative errors (see (104)) of the scheme described above of degrees $M = 4, 6, 10, 12, 16$ and the number of subintervals $K = 2, 4, 8, \ldots, 8192$, applied to the integral (108) are presented.

The  following  observations  can  be  made  from  the  examples  of  this  section,  and  from  the 
more  detailed  numerical  experiments  performed  by  the  authors. 

1. The quadrature formulae (85), (97) are not convergent in the classical sense; they are only convergent to a prescribed precision $\varepsilon$. Needless to say, the two are indistinguishable, as long as the prescribed precision is less than machine precision.

2. The schemes producing the quadrature formulae (75) - (77), (97) do not lose many digits compared to machine precision; constructing the quadratures in double precision arithmetic results in 11 - 12 correct digits; constructing them in extended precision arithmetic results in full double precision accuracy. Needless to say, the nodes and weights of the quadrature formulae (75) - (77), (97) can be (and have been) precomputed and stored, so that the need for extended precision during the construction of the quadrature is not a serious limitation.

3. The quadrature formula (85) experiences some loss of precision, not only during the precomputation of the nodes and weights, but also when the formula is applied to specific functions of the form (81). A fairly detailed investigation has led us to the conclusion that the loss of precision is associated with the evaluation of the "hypersingular" function (80), and is unavoidable; the phenomenon is very similar to the loss of precision associated with numerical differentiation, both in character and severity.

4. When the quadrature formulae of this paper are applied to oscillatory functions (of the form (108), or similar), they achieve their full precision at 10 - 15 nodes per wavelength (for the formulae (75) - (77), (97)), and 20 - 45 nodes per wavelength (for the formula (85)), respectively.

6  Generalizations  and  Conclusions 

A set of quadratures has been constructed for functions $f : [-1, 1] \to \mathbb{R}$ of the form

$$f(x) = \varphi(x) + \psi(x) \cdot \log|x| + \frac{\eta(x)}{x} + \frac{\theta(x)}{x^2}, \tag{113}$$

where $\varphi, \psi, \eta, \theta : [-1, 1] \to \mathbb{R}$ are smooth functions. The term "quadratures" in this case is somewhat of a misnomer, as functions of the form (113) are not integrable in the classical sense, and their integrals are to be interpreted in the appropriate "finite part" sense. One of the anticipated applications for such quadratures is the evaluation of integro-pseudo-differential operators (e.g. the Hilbert transform and the derivative of the Hilbert transform) arising from the solution of integral equations of potential theory in two dimensions (see, for example, [11, 12]).

The  work  presented  here  admits  several  straightforward  extensions: 

1. The quadratures in this paper can easily be modified for functions with singularities other than $\log|x|$, $\frac{1}{x}$, $\frac{1}{x^2}$. For example, using Chebyshev polynomials, quadrature formulae similar to (75) - (77), (85) for functions with singularities of the form

$$\frac{\log|x|}{\sqrt{1 - x^2}}, \tag{114}$$

$$\frac{1}{x \sqrt{1 - x^2}}, \tag{115}$$

$$\frac{1}{x^2 \sqrt{1 - x^2}}, \tag{116}$$

etc. are easily constructed.

2. A straightforward generalization of the quadratures of this paper in two dimensions leads to quadrature formulae on the square, integrating functions $f : [-1, 1] \times [-1, 1] \to \mathbb{R}$ of the form

$$f(x_1, x_2) = \varphi(x_1, x_2) + \frac{\psi(x_1, x_2)}{(x_1^2 + x_2^2)^{1/2}} + \frac{\eta(x_1, x_2)}{x_1^2 + x_2^2} + \frac{\theta(x_1, x_2)}{(x_1^2 + x_2^2)^{3/2}}, \tag{117}$$

where $\varphi, \psi, \eta, \theta : [-1, 1] \times [-1, 1] \to \mathbb{R}$ are smooth functions. Quadrature formulae of this type have been constructed, and the paper reporting them is in preparation.


A Existence of Quadrature Formulae for Functions of the Form $\varphi(x) + \psi(x) \cdot \log|x| + \frac{\eta(x)}{x} + \frac{\theta(x)}{x^2}$

In Section 3.2, we numerically construct quadrature formulae on the interval $[-1, 1]$ for functions of the form

$$f(x) = \varphi(x) + \psi(x) \cdot \log|x| + \frac{\eta(x)}{x} + \frac{\theta(x)}{x^2}. \tag{118}$$

The nodes of the quadratures we construct are Gaussian nodes $x_1, x_2, \ldots, x_N$ with a sufficiently large $N$, and their weights are determined via a least squares procedure. The purpose of this Appendix is to prove that the least squares process of Section 3.2 can be used to obtain quadratures of arbitrary accuracy. We do so by constructing a procedure that, given a real $\varepsilon > 0$ and a sufficiently large integer $N$, produces a set of weights $w_1, w_2, \ldots, w_N$ that, in combination with the Gaussian nodes $x_1, x_2, \ldots, x_N$, evaluate the integral (82) to precision $\varepsilon$.

Remark A.1 The procedure of this Appendix is quite inefficient, in the sense that it requires a very large number of nodes to obtain acceptable levels of accuracy; its purpose is to prove that such quadratures exist. The procedure for the actual evaluation of coefficients is described in Section 3.2, and results in schemes whose precision is satisfactory at moderate values of N (see Section 5).

The  following  lemma  follows  immediately  from  the  definition  of  the  integral,  and  the  fact 
that  a  logarithmic  singularity  is  integrable. 

Lemma A.2 Suppose that $j \ge 0$ is an integer number, and that $P_j$ denotes the $j$-th Legendre polynomial (see (17)). Then for any positive real number $\varepsilon$, there exists an integer $N_0 \ge 1$ such that for any $N \ge N_0$

$$\left| \int_{-1}^{1} P_j(x) \cdot \log|x| \, dx - \sum_{\substack{i=1 \\ x_i \ne 0}}^{N} w_i \cdot P_j(x_i) \cdot \log|x_i| \right| < \varepsilon, \tag{119}$$

with $x_1, x_2, \ldots, x_N$ and $w_1, w_2, \ldots, w_N$ the nodes and the weights of the $N$-point Gaussian quadrature (see Theorem 2.9).


The  following  lemma  is  an  immediate  consequence  of  Lemma  A.2. 

Lemma A.3 Suppose that $P_j$ denotes the $j$-th Legendre polynomial (see (17)). Then for any positive real number $\varepsilon$ and integer $M \ge 0$, there exists an integer $N_0 \ge 1$ such that for any $N \ge N_0$ and each $j = 0, 1, \ldots, M$

$$\left| \int_{-1}^{1} P_j(x) \, dx - \sum_{\substack{i=1 \\ x_i \ne 0}}^{N} w_i \cdot P_j(x_i) \right| < \varepsilon, \tag{120}$$

and

$$\left| \int_{-1}^{1} P_j(x) \cdot \log|x| \, dx - \sum_{\substack{i=1 \\ x_i \ne 0}}^{N} w_i \cdot P_j(x_i) \cdot \log|x_i| \right| < \varepsilon, \tag{121}$$

with $x_1, x_2, \ldots, x_N$ and $w_1, w_2, \ldots, w_N$ the nodes and the weights of the $N$-point Gaussian quadrature (see Theorem 2.9). Furthermore, for any function $F : [-1, 1] \to \mathbb{R}$ of the form

$$F(x) = \sum_{j=0}^{M} (a_j + b_j \cdot \log|x|) \cdot P_j(x), \tag{122}$$

with $a_j$, $b_j$ arbitrary real coefficients,

$$\left| \int_{-1}^{1} F(x) \, dx - \sum_{\substack{i=1 \\ x_i \ne 0}}^{N} w_i \cdot F(x_i) \right| < \varepsilon \cdot \sum_{j=0}^{M} \left( |a_j| + |b_j| \right). \tag{123}$$


The following lemma provides a formula for the evaluation of the integrals of functions that are linear combinations of polynomials, and polynomials composed with the singular function $\frac{1}{x^2}$.


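Lemma A.4 Suppose that $n \ge 1$ is an integer number, and that the function $F : [-1, 1] \to \mathbb{R}$ is defined by the formula

$$F(x) = P_n(x) + \frac{S_n(x)}{x^2}, \tag{124}$$

with $P_n, S_n : [-1, 1] \to \mathbb{R}$ arbitrary polynomials of degree $n$. Furthermore, suppose that the function $f : [-1, 1] \to \mathbb{R}$ is defined by the formula

$$f(x) = x^2 \cdot F(x). \tag{125}$$

Then

$$\text{f.p.} \int_{-1}^{1} F(x) \, dx = \sum_{\substack{i=1 \\ x_i \ne 0}}^{n} w_i \cdot \left( F(x_i) - \frac{f(0)}{x_i^2} \right) - 2 f(0), \tag{126}$$

where $w_1, w_2, \ldots, w_n$ and $x_1, x_2, \ldots, x_n$ are the weights and nodes of the $n$-point Gaussian quadrature, respectively (see Theorem 2.9).

Proof. Defining the function $G : [-1, 1] \to \mathbb{R}$ by the formula

$$G(x) = F(x) - \frac{f(0)}{x^2} - \frac{f'(0)}{x}, \tag{127}$$

we observe that $G$ is a polynomial of order $n$, and therefore

$$\int_{-1}^{1} G(x) \, dx = \sum_{\substack{i=1 \\ x_i \ne 0}}^{n} w_i \cdot \left( F(x_i) - \frac{f(0)}{x_i^2} - \frac{f'(0)}{x_i} \right). \tag{128}$$

Now, observing that

$$\sum_{\substack{i=1 \\ x_i \ne 0}}^{n} \frac{w_i}{x_i} = 0 \tag{129}$$

(due to the symmetry of the Gaussian nodes and weights about zero), and substituting (129) into (128), we have

$$\int_{-1}^{1} G(x) \, dx = \sum_{\substack{i=1 \\ x_i \ne 0}}^{n} w_i \cdot \left( F(x_i) - \frac{f(0)}{x_i^2} \right). \tag{130}$$

It immediately follows from (10) that

$$\text{f.p.} \int_{-1}^{1} \frac{f(0) + f'(0) \cdot x}{x^2} \, dx = -2 f(0), \tag{131}$$

and, combining (127), (130), (131), we obtain

$$\text{f.p.} \int_{-1}^{1} F(x) \, dx = \int_{-1}^{1} G(x) \, dx + \text{f.p.} \int_{-1}^{1} \frac{f(0) + f'(0) \cdot x}{x^2} \, dx \tag{132}$$

$$= \sum_{\substack{i=1 \\ x_i \ne 0}}^{n} w_i \cdot \left( F(x_i) - \frac{f(0)}{x_i^2} \right) - 2 f(0). \tag{133}$$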
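The identity of Lemma A.4 is easy to check numerically. The sketch below is illustrative only: the function names and the test polynomials are hypothetical, and an even number of nodes is used so that no Gaussian node falls at zero. The reference value is assembled from the finite-part integrals of monomials, using the facts that the finite part of $\int_{-1}^{1} x^{-2}\,dx$ is $-2$ and that the principal value of $\int_{-1}^{1} x^{-1}\,dx$ is $0$.

```python
import numpy as np
from numpy.polynomial import polynomial as P
from numpy.polynomial.legendre import leggauss


def finite_part_by_gauss(p_coef, s_coef, n):
    """Right-hand side of (126) for F(x) = P_n(x) + S_n(x) / x^2.

    p_coef, s_coef: coefficients in increasing powers of x.
    """
    x, w = leggauss(n)
    keep = x != 0.0                      # the sum in (126) skips a node at zero
    x, w = x[keep], w[keep]
    F = P.polyval(x, p_coef) + P.polyval(x, s_coef) / x**2
    f0 = s_coef[0]                       # f(0) = S_n(0), since f(x) = x^2 P_n(x) + S_n(x)
    return np.sum(w * (F - f0 / x**2)) - 2.0 * f0


def finite_part_exact(p_coef, s_coef):
    """Finite-part integral of F over [-1, 1], monomial by monomial."""
    def fp_monomial(k):
        if k == -2:
            return -2.0                  # finite part of the integral of x^(-2)
        if k % 2 != 0:
            return 0.0                   # odd powers, including x^(-1) (principal value)
        return 2.0 / (k + 1)
    total = sum(c * fp_monomial(k) for k, c in enumerate(p_coef))
    total += sum(c * fp_monomial(k - 2) for k, c in enumerate(s_coef))
    return total


if __name__ == "__main__":
    p = [1.0, -2.0, 0.5, 3.0, 0.0, 1.0, -1.0]   # hypothetical P_6
    s = [2.0, 1.0, -1.0, 0.5, 0.0, 0.0, 4.0]    # hypothetical S_6
    print(finite_part_by_gauss(p, s, 6), finite_part_exact(p, s))
```

For odd $n$ the node at zero is dropped from the sum, and the term $w_i \cdot G(0)$ is then missing from (128); the example sidesteps this by choosing $n$ even.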


Lemma A.5 Suppose that $F : [-1, 1] \to \mathbb{R}$ and $f : [-1, 1] \to \mathbb{R}$ are two functions defined by (124) and (125), respectively. Then there exists a positive real $C_1$ such that for any sufficiently small $h$,

$$\left| f(0) - \left( F(h) + F(-h) \right) \cdot \frac{h^2}{2} \right| < C_1 \cdot h^2. \tag{134}$$

Furthermore, for any real $\gamma \notin \{-1, 0, 1\}$, there exists a positive real number $C_2$ such that for any sufficiently small $h$,

$$\left| f(0) - \left( F(h) + F(-h) - F(\gamma h) - F(-\gamma h) \right) \cdot \frac{\gamma^2 h^2}{2 (\gamma^2 - 1)} \right| < C_2 \cdot h^4. \tag{135}$$

Proof. We start with observing that for any $F : [-1, 1] \to \mathbb{R}$ defined by (124), there exist such real numbers $a_{-2}, a_{-1}, a_0, a_1, \ldots, a_n$ that

$$F(x) = \frac{a_{-2}}{x^2} + \frac{a_{-1}}{x} + a_0 + a_1 x + \ldots + a_n x^n, \tag{136}$$

and due to (125),

$$a_{-2} = f(0). \tag{137}$$

It immediately follows from (136) that for small $h$,

$$F(h) = \frac{a_{-2}}{h^2} + \frac{a_{-1}}{h} + a_0 + a_1 h + a_2 h^2 + O(h^3), \tag{138}$$

$$F(-h) = \frac{a_{-2}}{h^2} - \frac{a_{-1}}{h} + a_0 - a_1 h + a_2 h^2 + O(h^3). \tag{139}$$

Adding (138) to (139), we obtain

$$F(h) + F(-h) = \frac{2 a_{-2}}{h^2} + 2 a_0 + 2 a_2 h^2 + O(h^4), \tag{140}$$

and (134) immediately follows from the combination of (137) and (140).
In order to prove (135), we replace $h$ with $\gamma \cdot h$ in (140) above, obtaining

$$F(\gamma h) + F(-\gamma h) = \frac{2 a_{-2}}{\gamma^2 h^2} + 2 a_0 + 2 a_2 \gamma^2 h^2 + O(h^4). \tag{141}$$

Subtracting (141) from (140), we have

$$F(h) + F(-h) - \left( F(\gamma h) + F(-\gamma h) \right) = \frac{2 a_{-2} (\gamma^2 - 1)}{\gamma^2 h^2} + 2 a_2 h^2 (1 - \gamma^2) + O(h^4), \tag{142}$$

and (135) immediately follows from the combination of (137) and (142). □
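The two estimates of Lemma A.5 can be compared on a concrete function. The sketch below is illustrative: the helper names and the coefficients are hypothetical, and $\gamma = 2$ is an arbitrary admissible choice. It recovers $f(0) = a_{-2}$ from samples of $F$ at $\pm h$ (second-order estimate, as in (134)) and at $\pm h$, $\pm \gamma h$ (fourth-order estimate, as in (135)).

```python
def make_F(a_m2, a_m1, a0, a1, a2):
    """F(x) = a_{-2}/x^2 + a_{-1}/x + a_0 + a_1 x + a_2 x^2, as in (136)."""
    return lambda x: a_m2 / x**2 + a_m1 / x + a0 + a1 * x + a2 * x**2


def estimate_f0_second_order(F, h):
    """Estimate of f(0) with an O(h^2) error, cf. (134)."""
    return (F(h) + F(-h)) * h**2 / 2.0


def estimate_f0_fourth_order(F, h, gamma=2.0):
    """Estimate of f(0) with an O(h^4) error, cf. (135); gamma must not be -1, 0 or 1."""
    diff = F(h) + F(-h) - F(gamma * h) - F(-gamma * h)
    return diff * gamma**2 * h**2 / (2.0 * (gamma**2 - 1.0))


if __name__ == "__main__":
    F = make_F(1.5, -0.7, 3.0, 2.0, 5.0)      # hypothetical coefficients, f(0) = 1.5
    for h in (1e-1, 1e-2):
        e2 = abs(estimate_f0_second_order(F, h) - 1.5)
        e4 = abs(estimate_f0_fourth_order(F, h) - 1.5)
        print(h, e2, e4)
```

For this particular $F$ the second-order error is $a_0 h^2 + a_2 h^4$ and the fourth-order error is $a_2 \gamma^2 h^4$, so halving $h$ should reduce the latter by a factor of about 16.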

The following theorem now immediately follows from the combination of Lemmas A.3 - A.5.

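Theorem A.6 Suppose that $P_j$ denotes the $j$-th Legendre polynomial (see (17)). Then for any positive real number $\varepsilon$ and integer $M \ge 0$, there exists an integer $N_0 \ge 1$, real coefficients $\tilde{w}_1, \tilde{w}_2, \ldots, \tilde{w}_N$, and a positive constant $C$ such that for any $N \ge N_0$ and each $j = 0, 1, \ldots, M$

$$\left| \int_{-1}^{1} P_j(x) \, dx - \sum_{\substack{i=1 \\ x_i \ne 0}}^{N} \tilde{w}_i \cdot P_j(x_i) \right| < \varepsilon, \tag{143}$$

$$\left| \int_{-1}^{1} P_j(x) \cdot \log|x| \, dx - \sum_{\substack{i=1 \\ x_i \ne 0}}^{N} \tilde{w}_i \cdot P_j(x_i) \cdot \log|x_i| \right| < \varepsilon, \tag{144}$$

$$\left| \text{f.p.} \int_{-1}^{1} \frac{P_j(x)}{x^2} \, dx - \sum_{\substack{i=1 \\ x_i \ne 0}}^{N} \tilde{w}_i \cdot \frac{P_j(x_i)}{x_i^2} \right| < \varepsilon, \tag{145}$$

$$\sum_{i=1}^{N} \tilde{w}_i^2 < C \cdot \sum_{i=1}^{N} w_i^2, \tag{146}$$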

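with $x_1, x_2, \ldots, x_N$ and $w_1, w_2, \ldots, w_N$ the nodes and the weights of the $N$-point Gaussian quadrature (see Theorem 2.9). Furthermore, for any function $F : [-1, 1] \to \mathbb{R}$ of the form

$$F(x) = \sum_{j=0}^{M} \left( a_j + b_j \cdot \log|x| + \frac{c_j}{x^2} \right) \cdot P_j(x), \tag{147}$$

with $a_j$, $b_j$, $c_j$ arbitrary real coefficients,

$$\left| \text{f.p.} \int_{-1}^{1} F(x) \, dx - \sum_{\substack{i=1 \\ x_i \ne 0}}^{N} \tilde{w}_i \cdot F(x_i) \right| < \varepsilon \cdot \sum_{j=0}^{M} \left( |a_j| + |b_j| + |c_j| \right). \tag{148}$$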
x_n                       w_{1,n}                   w_{2,n}                   w_{3,n}

-.9862838086968123E+00    0.6158759029892887E+00    -.1749507584908717E+00    -.1130556007318105E+03
-.9284348836635735E+00    -.3449922634155065E+01    -.2439832523477966E+00    0.2343635742304627E+02
-.8272013150697650E+00    0.6341017949494823E+00    -.2035606965679834E+00    0.2052970256686051E+02
-.6872929048116855E+00    -.1619416300699971E+01    -.2159934769461259E+00    -.1376240462154258E+02
-.5152486363581541E+00    0.4959237125495822E+00    -.1075251867819710E+00    0.1455155616946274E+02
-.3191123689278897E+00    -.1038139679411058E+01    -.1196358314284132E+00    -.1125576720477356E+02
-.1080549487073437E+00    0.3511876142040904E+00    0.1088206509124769E-01    0.1005717417020061E+02
0.1080549487073437E+00    -.6752486832724772E+00    -.1913486054919796E-01    -.7774959517403008E+01
0.3191123689278897E+00    0.2161950693504096E+00    0.9038214134065220E-01    0.6382331406035814E+01
0.5152486363581541E+00    -.4027047889340547E+00    0.4482568706166883E-01    -.4626948764626759E+01
0.6872929048116855E+00    0.1014500308386035E+00    0.1047670461892695E+00    0.3365725647246004E+01
0.8272013150697650E+00    -.1896412777930365E+00    0.5616254115094882E-01    -.2045992407816295E+01
0.9284348836635735E+00    0.2050334687326519E-01    0.6074345322744063E-01    0.1083005671766590E+01
0.9862838086968123E+00    -.3560786461470516E-01    0.2130791084865406E-01    -.2941688960408355E+00

Table 1: 14-node quadratures of the form (75) - (77) for y = -0.9862838086968123 (see Example 5.1 and Remark 5.1).


N      1            (y-x)^{-1}    log(|x-y|)    (y-x)^{-2}

36     0.560E-12    0.250E-13     0.420E-13     0.885E-15
42     0.257E-15    0.119E-14     0.225E-15     0.147E-13

Table 2: Relative errors of the quadrature formula (97) applied to the integrands (100), (101) - (103) (see Example 5.3).


N      1            (y-x)^{-1}    log(|x-y|)    (y-x)^{-2}

36     0.114E-14    0.581E-02     0.108E-04     0.121E+00
42     0.700E-15    0.277E-02     0.427E-05     0.680E-01
100    0.775E-15    0.192E-05     0.112E-08     0.114E-03
150    0.333E-15    0.350E-08     0.133E-11     0.310E-06
200    0.196E-14    0.631E-11     0.188E-14     0.746E-09
250    0.262E-14    0.106E-13     0.551E-15     0.167E-11
300    0.269E-14    0.967E-15     0.568E-15     0.525E-14

Table 3: Relative errors of the standard Gaussian quadrature (see Theorem 2.9) applied to the integrands (100), (101) - (103) (see Example 5.3).
±x_n                      w_n

0.1065589476527457E+00    0.2116935969670785E+00
0.3113548847160309E+00    0.1954154182193890E+00
0.4932817445063880E+00    0.1668941295018453E+00
0.6431876254991823E+00    0.1325004013421312E+00
0.7584402200373317E+00    0.9850855499442945E-01
0.8418072582807350E+00    0.6923612105413195E-01
0.8991123360358894E+00    0.4649700037042145E-01
0.9369451420922662E+00    0.3015693021568984E-01
0.9611808158857813E+00    0.1907410671190122E-01
0.9763813583671749E+00    0.1186194584542522E-01
0.9857845370872045E+00    0.7299783922072470E-02
0.9915537349540954E+00    0.4465717196444791E-02
0.9950772406715330E+00    0.2722792056317777E-02
0.9972224562334544E+00    0.1654307961017307E-02
0.9985216206191467E+00    0.9963611678876147E-03
0.9992964931862838E+00    0.5843631686022078E-03
0.9997376213125204E+00    0.3153728101867406E-03
0.9999525789657767E+00    0.1230964950065995E-03

Table 4: 36-node generalized Gaussian quadrature (97) for functions of the form (106) with M = 11, and precision 10^{-15} (see Example 5.3).


±x_n                      w_n

0.7824400816570354E-01    0.1559838796617961E+00
0.2317400514932991E+00    0.1500543303602524E+00
0.3765817141635966E+00    0.1388302709124357E+00
0.5080234535636137E+00    0.1234870921831402E+00
0.6226938088738944E+00    0.1055618635285824E+00
0.7188418253624399E+00    0.8671614170628514E-01
0.7963343649196293E+00    0.6848351985661966E-01
0.8564163016327517E+00    0.5205731921370713E-01
0.9013001486524265E+00    0.3816842653276627E-01
0.9336896680922276E+00    0.2707608184111357E-01
0.9563457975135937E+00    0.1865610690150748E-01
0.9717714213532305E+00    0.1254153267525754E-01
0.9820411592483684E+00    0.8264234965377917E-02
0.9887573995032291E+00    0.5361830655763248E-02
0.9930900683346067E+00    0.3438177342595994E-02
0.9958561171201172E+00    0.2184437514815405E-02
0.9976063686147585E+00    0.1375256690097983E-02
0.9987019026654443E+00    0.8535706349265051E-03
0.9993734804740140E+00    0.5129451502696074E-03
0.9997640500479557E+00    0.2818251084208615E-03
0.9999571252163234E+00    0.1111565642688685E-03

Table 5: 42-node generalized Gaussian quadrature (97) for functions of the form (106) with M = 21, and precision 10^{-15} (see Example 5.3).
Figure 1: The set DRj.


Figure 2: Relative errors of the quadrature formulae (75) - (77) with N = 6, 8, ..., 26 applied to the integrands (101) - (103) (see Example 5.1). The relative errors of the N-point Gaussian quadratures with N = 6, 8, ..., 26 applied to the function (100) are presented for comparison.


Figure 3: Relative errors of the quadrature formula (85) with M = 6, 8, ..., 24 and N = 6 · M, applied to the integrands (101) - (103) (see Example 5.2).
K       degree 4     degree 6     degree 10    degree 12    degree 16

2       0.976E+00    0.105E+01    0.904E+01    0.372E+01    0.799E+01
4       0.109E+01    0.178E+01    0.998E+01    0.622E+01    0.325E+01
8       0.157E+01    0.226E+01    0.429E+01    0.239E+01    0.188E+01
16      0.215E+01    0.149E+01    0.212E+01    0.103E+01    0.788E+00
32      0.131E+01    0.103E+01    0.219E+00    0.483E-01    0.184E-02
64      0.556E+00    0.115E+00    0.194E-02    0.166E-02    0.368E-03
128     0.614E-01    0.285E-02    0.115E-05    0.126E-07    0.364E-09
256     0.442E-02    0.498E-04    0.133E-08    0.270E-09    0.693E-09
512     0.280E-03    0.778E-06    0.837E-09    0.476E-08    0.384E-08
1024    0.165E-04    0.125E-07    0.150E-08    0.149E-07    0.147E-07
2048    0.102E-05    0.271E-08    0.171E-07    0.293E-07    0.532E-07
4096    0.635E-07    0.231E-07    0.613E-07    0.921E-07    0.128E-06
8192    0.110E-07    0.113E-06    0.300E-06    0.134E-05    0.705E-06

Table 6: Relative errors of the compound quadrature formula of degrees M = 4, 6, 10, 12, 16 and the number of subintervals K = 2, 4, ..., 8192 applied to the integral (108) (see Example 5.4).
Whenever physical signals are measured or generated, the locations of receivers or transducers have to be selected. Most of the time, this appears to be done on an ad hoc basis. For example, when a string of geophones is used in the measurements of seismic data in oil exploration, the receivers are located at equispaced points on an interval. When phased array antennae are constructed, their shapes are determined by certain aperture considerations: round and rectangular shapes are common. When antenna beams are steered electronically, it is done by changing the phases (and sometimes, the amplitudes) of the transducers. Again, these transducers are located in a region of predetermined geometry, and their actual locations within that geometry are chosen via some heuristic procedure. In all these (and many other) cases, the signals being received or generated are band-limited. Optimal representation of such signals was studied in detail by Slepian et al. more than 30 years ago, and some of the obtained results were applied by D. Rhodes to the design of antenna patterns; further development of this line of research appears to have been hindered by the absence at the time of necessary numerical tools. We combine these classical results with the recently developed apparatus of Generalized Gaussian Quadratures to construct optimal nodes for the measurement and generation of band-limited signals. In this report, we describe the procedure based on these techniques for the design of such receiver (and transducer) configurations in a variety of environments.

Measurement and Generation of Band-Limited Signals

1 Introduction


When measurements are performed, it often happens that the signal to be measured is well approximated by linear combinations of oscillatory exponentials, i.e. functions of the form

$$\sum_{j=1}^{n} \alpha_j \cdot e^{i \lambda_j x} \tag{1}$$

in one dimension, of the form

$$\sum_{j=1}^{n} \alpha_j \cdot e^{i (\lambda_j x + \mu_j y)} \tag{2}$$

in two dimensions, and of the form

$$\sum_{j=1}^{n} \alpha_j \cdot e^{i (\lambda_j x + \mu_j y + \nu_j z)} \tag{3}$$

in three dimensions. In most cases, the signal is band-limited, i.e. there exists such a real positive $a$ that for all $1 \le j \le n$,

$$|\lambda_j| \le a \tag{4}$$

in one dimension,

$$\lambda_j^2 + \mu_j^2 \le a^2 \tag{5}$$

in two dimensions, and

$$\lambda_j^2 + \mu_j^2 + \nu_j^2 \le a^2 \tag{6}$$

in three dimensions.

As is well-known, most measurements of electromagnetic and acoustic data (especially at reasonably high frequencies) are of this form. Examples of such situations include geophone and hydrophone strings in geophysics, phased array antennae in radar systems, multiple transceivers in ultrasound imaging, and a number of other applications in astrophysics, medical imaging, non-destructive testing, etc.

In this report, we describe a procedure for determining the optimal distribution of sources and receivers that maximizes accuracy and resolution in measuring band-limited data given a fixed number of receivers. Alternatively, the procedure can be used to determine the optimal distribution of receivers that will minimize their number given specified accuracy and resolution. While the techniques described in this note are fairly general, we describe them in detail in the case of linear antenna arrays; the changes needed to generalize the approach to other cases are summarized in Section 6.

Remark 1.1 One of the principal issues in the design of antenna arrays is the treatment (or avoidance) of the so-called supergain (or superdirectivity). Supergain is the condition that occurs when an antenna design is attempted that is prohibited (or nearly prohibited) by the Heisenberg principle; technically, it occurs in the form of very closely spaced elements operating out of phase, and leads to prohibitive Ohmic losses in transmitting antennae, loss of sensitivity in receiving ones, etc. Since the purpose of this note is to introduce techniques for selecting the locations of elements for a prescribed antenna pattern, we avoid the issue of choosing the antenna pattern altogether. Instead, we design optimal element distributions for several standard far-field patterns (see Section 5.1), and we observe that the scheme for choosing optimal distributions of elements is virtually independent of the patterns being approximated.

Technically, the approach taken here is to observe that designing an antenna array can be viewed as constructing a quadrature formula for the integration of certain special classes of functions. Using recently developed techniques for the construction of so-called Generalized Gaussian Quadratures, we obtain both nodes and weights that are optimal (in a very strong sense) for the required antenna pattern.

The structure of this note is as follows. In Section 2, we summarize some of the mathematical apparatus to be used: Chebyshev Systems, Generalized Gaussian Quadratures, etc. In Section 3, we recapitulate some of the standard antenna theory, primarily to introduce the necessary notation. In Section 4, we construct element distributions given a specific antenna pattern. In Section 5, we illustrate our approach with several numerical examples, and Section 6 contains a discussion of the generality of the schemes presented.

2  Analytical  Preliminaries 

In  this  section,  we  summarize  several  known  facts  about  classical  Special  functions.  All 
of  these  facts  can  be  found  in  the  literature:  detailed  references  are  given  in  the  text. 

2.1  Chebyshev  systems 

Definition 2.1 A sequence of functions $\phi_1, \ldots, \phi_n$ will be referred to as a Chebyshev system on the interval $[a, b]$ if each of them is continuous and the determinant

$$\begin{vmatrix} \phi_1(x_1) & \cdots & \phi_1(x_n) \\ \vdots & & \vdots \\ \phi_n(x_1) & \cdots & \phi_n(x_n) \end{vmatrix} \tag{7}$$

is nonzero for any sequence of points $x_1, \ldots, x_n$ such that $a \le x_1 < x_2 < \ldots < x_n \le b$.

An alternate definition of a Chebyshev system is that any linear combination of the functions whose coefficients are not all zero must have no more than $n - 1$ zeros on $[a, b]$.

Examples of Chebyshev and extended Chebyshev systems include the following (additional examples can be found in [8]).

Example 2.1 The powers $1, x, x^2, \ldots, x^n$ form an extended Chebyshev system on the interval $(-\infty, \infty)$.

Example 2.2 The exponentials $e^{-\lambda_1 x}, e^{-\lambda_2 x}, \ldots, e^{-\lambda_n x}$ form an extended Chebyshev system for any distinct $\lambda_1, \ldots, \lambda_n > 0$ on the interval $[0, \infty)$.

Example 2.3 The functions $1, \cos x, \sin x, \cos 2x, \sin 2x, \ldots, \cos nx, \sin nx$ form a Chebyshev system on the interval $[0, 2\pi)$.
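The determinant condition of Definition 2.1 can be checked numerically for small systems. The following sketch is illustrative only (the helper `collocation_determinant` and the sample points are hypothetical); it evaluates the determinant for the powers of Example 2.1, where it coincides with the classical Vandermonde product, and for the exponentials of Example 2.2.

```python
import numpy as np


def collocation_determinant(funcs, points):
    """Determinant of the matrix with entries funcs[i](points[j]), cf. Definition 2.1."""
    matrix = np.array([[phi(x) for x in points] for phi in funcs], dtype=float)
    return np.linalg.det(matrix)


def vandermonde_product(points):
    """Closed form of the determinant for the powers 1, x, ..., x^(n-1)."""
    pts = list(points)
    prod = 1.0
    for j in range(len(pts)):
        for i in range(j):
            prod *= pts[j] - pts[i]
    return prod


if __name__ == "__main__":
    pts = [-0.9, -0.3, 0.1, 0.4, 0.8]
    powers = [lambda x, k=k: x**k for k in range(len(pts))]
    print(collocation_determinant(powers, pts), vandermonde_product(pts))

    lams = [0.5, 1.0, 2.0, 3.0]            # distinct positive exponents
    exps = [lambda x, lam=lam: np.exp(-lam * x) for lam in lams]
    print(collocation_determinant(exps, [0.0, 0.5, 1.0, 2.0]))
```

A nonzero determinant at a handful of sample points is, of course, only a spot check and not a proof that a system is Chebyshev.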
Example 2.4 Suppose that $c > 0$ is a real number, $\omega$ is a positive function $[-1, 1] \to \mathbb{R}$ such that $\omega \in C^1[-1, 1]$ and $\omega(-x) = \omega(x)$ for all $x \in [-1, 1]$, $n$ is a natural number, and the operators $P, Q : L^2[-1, 1] \to L^2[-1, 1]$ are defined by the formulae

$$P(\phi)(x) = \int_{-1}^{1} \omega(t) \cdot e^{i c x t} \cdot \phi(t) \, dt, \tag{8}$$

$$Q = P^* \circ P. \tag{9}$$

Suppose further that $\phi_1, \phi_2, \ldots$ are the eigenfunctions of $Q$, $\lambda_1, \lambda_2, \ldots$ are the corresponding eigenvalues, and $\lambda_1 \ge \lambda_2 \ge \lambda_3 \ge \ldots$. Then all eigenfunctions of $Q$ (also known as the right singular vectors of $P$) can be chosen to be real. Furthermore, the functions $\phi_1, \phi_2, \ldots, \phi_n$ constitute a Chebyshev system on the interval $[-1, 1]$.
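The operator $P$ of Example 2.4 can be explored numerically by discretizing it on Gauss-Legendre nodes and taking a matrix SVD. The sketch below is an assumption-laden illustration, not the construction used in the references: the function names are hypothetical, the weight function defaults to the constant 1, and the rows and columns are scaled by square roots of the quadrature weights so that the matrix singular values approximate those of the operator in $L^2[-1, 1]$.

```python
import numpy as np


def discretized_operator(c, m, weight=lambda t: np.ones_like(t)):
    """m x m Nystrom-type discretization of P from (8) on Gauss-Legendre nodes."""
    t, q = np.polynomial.legendre.leggauss(m)
    kernel = np.exp(1j * c * np.outer(t, t))          # e^{i c x t} sampled at the nodes
    sq = np.sqrt(q)
    # Symmetric scaling by sqrt(q) so that Euclidean norms mimic L2[-1, 1] norms.
    return sq[:, None] * kernel * (weight(t) * sq)[None, :]


def significant_rank(c, m, eps):
    """Number of singular values exceeding eps times the largest one."""
    s = np.linalg.svd(discretized_operator(c, m), compute_uv=False)
    return int(np.sum(s > eps * s[0])), s


if __name__ == "__main__":
    rank, s = significant_rank(c=10.0, m=60, eps=1e-10)
    print(rank)
    print(s[:8])
```

The singular values stay roughly level for the first few indices (about $2c/\pi$ of them) and then decay very rapidly, which is why a small number of singular functions suffices in practice. The eigenvalues of $Q$ are the squares of the singular values of $P$.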

2.2  Generalized  Gaussian  quadratures 

A quadrature rule is an expression of the form

$$\sum_{j=1}^{n} w_j \cdot \phi(x_j), \tag{10}$$

where the points $x_j \in \mathbb{R}$ and coefficients $w_j \in \mathbb{R}$ are referred to as the nodes and weights of the quadrature, respectively. They serve as approximations to integrals of the form

$$\int_{a}^{b} \phi(x) \cdot \omega(x) \, dx, \tag{11}$$

with $\omega$ an integrable non-negative function.

Quadratures are typically chosen so that the quadrature (10) is equal to the desired integral (11) for some set of functions, commonly polynomials of some fixed order. Of these, the classical Gaussian quadrature rules consist of $n$ nodes and integrate polynomials of order $2n - 1$ exactly. In [13], the notion of a Gaussian quadrature was generalized as follows:

Definition 2.2 A quadrature formula will be referred to as Gaussian with respect to a set of $2n$ functions $\phi_1, \ldots, \phi_{2n} : [a, b] \to \mathbb{R}$ and a weight function $\omega : [a, b] \to \mathbb{R}^+$, if it consists of $n$ weights and nodes, and integrates the functions $\phi_i$ exactly with the weight function $\omega$ for all $i = 1, \ldots, 2n$. The weights and nodes of a Gaussian quadrature will be referred to as Gaussian weights and nodes respectively.
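The classical special case mentioned above (an $n$-node rule that is exact for polynomials of order $2n - 1$) can be verified directly with the Gauss-Legendre rule for the weight function identically equal to 1 on $[-1, 1]$. The helper name below is hypothetical; this checks only the classical rule, not the generalized quadratures of [13].

```python
import numpy as np


def gauss_error_on_monomial(n, k):
    """Absolute error of the n-point Gauss-Legendre rule applied to x^k on [-1, 1]."""
    x, w = np.polynomial.legendre.leggauss(n)
    exact = 0.0 if k % 2 else 2.0 / (k + 1)
    return abs(np.dot(w, x**k) - exact)


if __name__ == "__main__":
    n = 5
    for k in range(2 * n + 1):
        # Errors are at rounding level for k <= 2n - 1 and visible at k = 2n.
        print(k, gauss_error_on_monomial(n, k))
```

With $n$ nodes and $n$ weights there are $2n$ free parameters, which matches the $2n$ functions integrated exactly in Definition 2.2.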
The following theorem appears to be due to Markov [15, 16]; proofs of it can also be found in [10] and [8] (in a somewhat different form).

Theorem 2.1 Suppose that the functions $\phi_1, \ldots, \phi_{2n} : [a, b] \to \mathbb{R}$ form a Chebyshev system on $[a, b]$. Suppose in addition that $\omega : [a, b] \to \mathbb{R}$ is a non-negative integrable function. Then there exists a unique Gaussian quadrature for the functions $\phi_1, \ldots, \phi_{2n}$ on $[a, b]$ with respect to the weight function $\omega$. The weights of this quadrature are positive.

Remark 2.1 While the existence of Generalized Gaussian Quadratures was observed more than 100 years ago, the constructions found in [15, 16], [3, 10], [7, 8] do not easily yield numerical algorithms for the design of such quadrature formulae; such algorithms have been constructed recently (see [13, 28, 2]). The version of the procedure found in [2] was used to produce the results presented in Examples 5.1, 5.2, 5.3 in Section 5.1; the reader is referred to [2] for details.

Applying Theorem 2.1 to Example 2.4, we obtain the following theorem.

Theorem 2.2 Suppose that under the conditions of Example 2.4, $n$ is even. Then there exist $n/2$ points $t_1, t_2, \ldots, t_{n/2}$ on the interval $[-1, 1]$ and positive real numbers $w_1, w_2, \ldots, w_{n/2}$ such that

$$\int_{-1}^{1} \omega(t) \cdot \phi_i(t) \, dt = \sum_{j=1}^{n/2} w_j \cdot \phi_i(t_j), \tag{12}$$

for all $i = 1, 2, \ldots, n$, with $\phi_1, \phi_2, \ldots, \phi_n$ the first $n$ eigenfunctions of the operator $Q$ defined in (9).

Corollary 2.3 The above theorem provides a tool for the efficient approximate evaluation of integrals of the form (12), as follows. Given a positive real ε, we construct the Singular Value Decomposition of the operator P defined in (8). Choosing n to be the smallest even integer such that


$$\sum_{j=n+1}^{\infty} \lambda_j^2 < \varepsilon, \tag{13}$$


we construct an $n/2$-point quadrature that integrates the first $n$ right singular functions exactly (effective numerical schemes for the construction of such quadratures can be found in [13, 28, 2]). Now, we observe that due to the triangle inequality combined with the positivity of the obtained weights $w_1, w_2, \ldots, w_{n/2}$,

$$\left| \sum_{j=1}^{n/2} w_j \cdot e^{i c x t_j} - \int_{-1}^{1} \omega(t) \cdot e^{i c x t} \, dt \right| < \varepsilon \tag{14}$$

for any $x \in [-1, 1]$.


Remark 2.2 The principal subject of this note is the fact that the pattern of an antenna array is formed by a physical process amounting to a hardware implementation of a quadrature formula for functions of the form (9). Thus, designing a configuration of elements for such an antenna is equivalent to constructing a quadrature formula for functions of the form (9), and can be achieved via the techniques described in [13, 28, 2].


3  Elements  of  Antenna  Theory 

In  this  section,  we  summarize  certain  facts  about  the  theory  of  linear  antenna  arrays;  all 
of  these  facts  are  well-known,  and  can  be  found,  for  example,  in  [9]. 

3.1  Pattern  of  a  linear  array 

A source distribution $\sigma$ on the interval $[-1, 1]$ creates the far-field pattern $f : [0, \pi] \to \mathbb{C}$ given by the formula

$$f(\theta) = \int_{-1}^{1} \sigma(u) \cdot e^{i k u \cos(\theta)} \, du, \tag{15}$$

where $k$ is the free-space wavenumber, $u$ is the point on the interval $[-1, 1]$, and $\theta$ is the angle between the point on the horizon where the far field is being evaluated and the x-axis. It is customary to introduce the notation

$$x = \cos(\theta), \tag{16}$$

and define the function $F : [-1, 1] \to \mathbb{C}$ by the formula

$$F(x) = f(\arccos(x)). \tag{17}$$

Now, defining the operator $A : L^2[-1, 1] \to L^2[-1, 1]$ by the formula

$$A(\sigma)(x) = \int_{-1}^{1} \sigma(u) \cdot e^{i k u x} \, du, \tag{18}$$

we observe that

$$F = A(\sigma) = \int_{-1}^{1} \sigma(u) \cdot e^{i k u x} \, du. \tag{19}$$

The function $F$ is usually more convenient to work with than $f$, and the following obvious lemma is the principal reason for this difference.

Lemma 3.1 Suppose that $\sigma \in L^2[-1, 1]$, the function $F \in L^2[-1, 1]$ is defined by (19), $a$ is a real number, and the function $\tilde{\sigma} \in L^2[-1, 1]$ is defined by the formula

$$\tilde{\sigma}(u) = e^{-i a k u} \cdot \sigma(u). \tag{20}$$

Then

$$A(\tilde{\sigma})(x) = A(\sigma)(x - a) \tag{21}$$

for all $x \in (-\infty, \infty)$. In other words, in order to translate the antenna pattern $F$ (viewed as a function of $x = \cos(\theta)$) by $a$, one has to multiply by $e^{-i a k u}$ the source distribution $\sigma$ generating the pattern $F$.
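The pattern integral (19) and the translation property of Lemma 3.1 can be checked numerically. The sketch below is illustrative: the function name `pattern`, the choice of Gauss-Legendre quadrature with 200 nodes, and the sample source distribution are all assumptions. The sign convention is stated explicitly in the code: multiplying the source by $e^{-i a k u}$ moves the pattern from $F(x)$ to $F(x - a)$.

```python
import numpy as np


def pattern(sigma, k, x, m=200):
    """Approximate F(x) = integral over [-1, 1] of sigma(u) * exp(i k u x) du, cf. (19)."""
    u, q = np.polynomial.legendre.leggauss(m)      # quadrature in u on [-1, 1]
    x = np.atleast_1d(np.asarray(x, dtype=float))
    return np.exp(1j * k * np.outer(x, u)) @ (q * sigma(u))


if __name__ == "__main__":
    k, a = 20.0, 0.3
    sigma = lambda u: np.cos(0.5 * np.pi * u)                      # hypothetical taper
    sigma_shifted = lambda u: np.exp(-1j * a * k * u) * sigma(u)   # phase ramp across the array

    x = np.linspace(-1.0, 1.0, 5)
    print(pattern(sigma_shifted, k, x))
    print(pattern(sigma, k, x - a))    # same values: the pattern is translated by a
```

This phase ramp is the continuous analogue of electronic beam steering: the amplitudes of the source are untouched, and only the phases change linearly along the array.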
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Observation  3.1  While  the  obvious  physical  considerations  lead  to  the  antenna  pattern 
F  defined  on  the  interval  [—1. 1]  ,  the  formulae  (15),  (17)  also  define  naturally  the  exten¬ 
sion  of  F  to  the  function  R  -4  C/  in  a  mild  abuse  of  notation,  we  will  be  denoting  by  F 
both  the  original  mapping  [-1. 1]  -4  C  and  its  extension  to  the  mapping  R  -4  C.  Simi¬ 
larly,  ice  will  be  denoting  by  .4  both  the  operator  L2[- 1,  1]  -4  L2[- 1. 1]  defined  by  (IS) 
and  its  natural  extension  mapping  L2[- 1. 1]  -4  c,3C(R).  The  restriction  of  F  on  R\f-1. 1] 
is  refer  red  to  as  the  invisible  spectrum  of  the  source  distribution  o  and  plays  an  important 
role  in  the  antenna  theory  (this  role  is  discussed  briefly  in  the  following  subsection).  By 
the  same  token,  the  restriction  of  F  on  the  interval  [-1.1]  is  referred  to  as  the  visible 
spectrum. 
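The translation property of Lemma 3.1 is easy to check numerically. The sketch below uses the sign convention in which multiplying the source by exp(-i a k u) moves the pattern by +a in the variable x; the test source, the quadrature rule, and the values of k, a and x are hypothetical choices made for the illustration.

```python
import cmath
import math


def pattern(sigma, k, x, n=4000):
    """Midpoint-rule approximation of int_{-1}^{1} sigma(u) * exp(i*k*u*x) du."""
    h = 2.0 / n
    total = 0j
    for j in range(n):
        u = -1.0 + (j + 0.5) * h
        total += sigma(u) * cmath.exp(1j * k * u * x)
    return total * h


def sigma(u):
    """Hypothetical smooth source distribution used only for this check."""
    return math.cos(0.5 * math.pi * u) ** 2


k = 10.0 * math.pi
a = 0.3                                    # amount by which the pattern is shifted


def sigma_shifted(u):
    """Source multiplied by the linear phase exp(-i*a*k*u)."""
    return cmath.exp(-1j * a * k * u) * sigma(u)


x = 0.45
lhs = pattern(sigma_shifted, k, x)         # A(sigma~)(x)
rhs = pattern(sigma, k, x - a)             # A(sigma)(x - a)
print(abs(lhs - rhs))
```

Note that x - a may fall outside [-1, 1] for other choices of x and a; the identity still holds there, which is one reason the extension of F to the whole real line is convenient.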


When an antenna array is implemented in hardware, it is (usually) constructed of
a finite collection of elements, as opposed to being a continuous source distribution.
Mathematically, this is equivalent to replacing the general function $\sigma$ in (15), (19) with a
function of the form

$$\sigma(u) = \sum_{j=1}^{n} \beta_j \cdot \phi_j(u), \qquad (22)$$

with $\phi_1, \phi_2, \ldots, \phi_n$ the source distributions generated by individual elements, and the
coefficients $\beta_1, \beta_2, \ldots, \beta_n$ the intensities of the elements. As a rule, the elements are
localized in space (i.e. the functions $\phi_1, \phi_2, \ldots, \phi_n$ are supported on small subintervals
of $[-1, 1]$), and very often, all of the elements are identical (i.e. the functions $\phi_j$ are
translates of each other), so that

$$\phi_j(u) = \phi(u - u_j), \qquad (23)$$

with $\phi$ the source distribution of a single element located at the point $u = 0$, and $u_j$ the
location of the element number $j$. Obviously, the far-field pattern of $\phi$ is given by the
formula

$$F_0(x) = \int_{-1}^{1} \phi(u) \cdot e^{i k u x} \, du; \qquad (24)$$

combining (24) with (22) and (23), we obtain the identity

$$F(x) = \int_{-1}^{1} \phi(u) \cdot e^{i k u x} \, du \cdot \sum_{j=1}^{n} \beta_j \cdot e^{i k u_j x}, \qquad (25)$$

known in antenna theory as the principle of pattern multiplication.

Remark 3.2 The standard form of the principle of multiplication reads: "The field
pattern of an array of nonisotropic but similar point sources is the product of the pattern
of the individual source and the pattern of an array of isotropic point sources, having
the same locations, relative amplitudes and phases as the nonisotropic point sources" (see
[9]). Needless to say, this is a special case of the well-known theorem from the theory of
the Fourier Transform, stating that the Fourier transform of the product of two functions
is the convolution of the Fourier Transforms of the multiplicands.
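The principle of pattern multiplication can be verified numerically for a small array of identical elements. In the sketch below, the Gaussian element shape, the element locations and intensities, and the midpoint quadrature are all assumptions chosen for illustration; the element is narrow enough that its translates are negligible outside [-1, 1].

```python
import cmath
import math


def integrate(func, n=4000):
    """Composite midpoint rule on [-1, 1]."""
    h = 2.0 / n
    return h * sum(func(-1.0 + (j + 0.5) * h) for j in range(n))


def element(u, width=0.05):
    """Hypothetical single-element distribution: a narrow bump centred at u = 0."""
    return math.exp(-(u / width) ** 2)


k = 10.0 * math.pi
positions = [-0.6, -0.2, 0.1, 0.5]            # element locations u_j (assumed)
intensities = [1.0, 0.5 - 0.2j, -0.3, 0.8j]   # element intensities (assumed)
x = 0.37


def total_source(u):
    """Sum of translated, weighted copies of the single-element distribution."""
    return sum(b * element(u - uj) for b, uj in zip(intensities, positions))


# Left-hand side: pattern of the whole array, integrated directly.
direct = integrate(lambda u: total_source(u) * cmath.exp(1j * k * u * x))
# Right-hand side: single-element pattern times the array factor.
element_pattern = integrate(lambda u: element(u) * cmath.exp(1j * k * u * x))
array_factor = sum(b * cmath.exp(1j * k * uj * x)
                   for b, uj in zip(intensities, positions))
print(abs(direct - element_pattern * array_factor))
```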

4  Antenna  Patterns  and  Corresponding  Optimal  Element  Distributions

4.1  Characteristics  of  an  antenna  pattern 

Depending on the situation, the design of an antenna array attempts to optimize certain
characteristics of the resulting far-field pattern, subject to certain constraints on the
number, power, etc. of the elements. Since the principal purpose of this note is to
describe a technique for the selection of the locations of the elements that approximate a
user-specified pattern, we could use any reasonable far-field pattern to be approximated.
In Subsections 4.2 and 4.3, we construct optimal element distributions for the so-called sector
patterns and cosecant patterns, respectively; a detailed discussion of these (and several
other) patterns can be found, for example, in [14].

We will say that the antenna pattern has the $\varepsilon$-beamwidth $b$ if

$$\int_{b < |x| < 1} |F(x)|^2 \, dx = \varepsilon^2 \cdot \int_{-1}^{1} |F(x)|^2 \, dx; \qquad (26)$$

in other words, the proportion of the energy radiated outside the $\varepsilon$-beamwidth from the
axis of the beam is equal to $\varepsilon^2$. The supergain of an antenna is defined (see, for example,
[27]) as the ratio

$$\frac{\int_{-\infty}^{\infty} |F(x)|^2 \, dx}{\int_{-1}^{1} |F(x)|^2 \, dx}. \qquad (27)$$

The supergain (sometimes referred to as superdirectivity) measures the ratio of the energy
associated with the total spectrum of the antenna to the energy in its visible spectrum;
while detailed discussion of supergain and related issues is outside the scope of this
note, we will observe that antenna arrays with large degrees of supergain would violate
the uncertainty principle, and thus are physically impossible. Attempts to construct
supergain antennae result in rapidly (exponentially) growing Ohmic losses, prohibitive
accuracy requirements, extremely low bandwidth, etc. Thus, any potentially useful procedure
for the design of antenna arrays has to limit the supergain of the resulting patterns.
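The ratio (27) can be estimated numerically without integrating over the whole real line: by Parseval's identity, the energy of F over the real line equals (2π/k) times the energy of the source distribution on [-1, 1]. The sketch below applies this to a uniform source; the function names, the midpoint quadrature, and the grid sizes are assumptions of the illustration.

```python
import cmath
import math


def pattern(sigma, k, x, n=600):
    """Midpoint-rule approximation of int_{-1}^{1} sigma(u) * exp(i*k*u*x) du."""
    h = 2.0 / n
    total = 0j
    for j in range(n):
        u = -1.0 + (j + 0.5) * h
        total += sigma(u) * cmath.exp(1j * k * u * x)
    return total * h


def supergain(sigma, k, n=600):
    """Energy of F over the real line divided by its energy over [-1, 1]."""
    h = 2.0 / n
    mids = [-1.0 + (j + 0.5) * h for j in range(n)]
    # Parseval: int_R |F(x)|^2 dx = (2*pi/k) * int_{-1}^{1} |sigma(u)|^2 du
    total = (2.0 * math.pi / k) * h * sum(abs(sigma(u)) ** 2 for u in mids)
    # Visible-spectrum energy, integrated directly over x in [-1, 1].
    visible = h * sum(abs(pattern(sigma, k, x, n)) ** 2 for x in mids)
    return total / visible


k = 20.0 * math.pi
gain = supergain(lambda u: 1.0, k)
print(gain)
```

For a uniform source on a long array almost all of the energy lies in the visible spectrum, so the ratio is only slightly above 1; source distributions that oscillate rapidly on the scale of the wavelength push it up.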

4.2  Sector  patterns 

It is often desirable to construct antenna patterns that are as constant as possible within
the main beam, and as small as possible outside it; in other words, ideally, the pattern
would be defined by the formulae

$$F_b(x) = 1 \quad \text{for } |k \cdot x| \le b, \qquad (28)$$

$$F_b(x) = 0 \quad \text{for } |k \cdot x| > b, \qquad (29)$$

with $b$ a real number such that $0 < b < k$. Needless to say, the function $F_b$ defined by
the formulae (28), (29) is not band-limited, and some approximation has to be used. A
standard procedure is to truncate the Fourier Transform of $F_b$, approximating it by the
function $\tilde F_b$ defined by the formula

$$\tilde F_b(x) = \int_{-1}^{1} \frac{\sin(b \cdot t)}{\pi \cdot t} \cdot e^{i k x t} \, dt \qquad (30)$$

(see, for example, [26]). An important special case occurs when $b = k$, with (30) assuming
the form

$$\tilde F_k(x) = \int_{-1}^{1} \frac{\sin(k \cdot t)}{\pi \cdot t} \cdot e^{i k x t} \, dt; \qquad (31)$$

obviously, the latter expression is a band-limited approximation of the $\delta$-function. Another
frequently encountered situation is that of $b = k/2$, so that (30) assumes the form

$$\tilde F_{k/2}(x) = \int_{-1}^{1} \frac{\sin(k \cdot t / 2)}{\pi \cdot t} \cdot e^{i k x t} \, dt, \qquad (32)$$

which is a band-limited approximation to the beam that is equal to 1 for $-1/2 < x < 1/2$
and to zero elsewhere.
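To see how well a truncated Fourier representation reproduces a sector beam, one can evaluate it directly. The sketch below assumes that the truncated pattern has the form of the integral over [-1, 1] of sin(b t)/(π t) exp(i k x t) dt with b = k/2; this form, the function name, and the midpoint quadrature are assumptions of the illustration rather than statements taken from the original.

```python
import math


def sector_approximation(b, k, x, n=2000):
    """Midpoint-rule value of int_{-1}^{1} sin(b*t)/(pi*t) * cos(k*x*t) dt.

    The sine part of exp(i*k*x*t) integrates to zero by symmetry, so only the
    cosine part is kept; n is even, so no midpoint falls on t = 0."""
    h = 2.0 / n
    total = 0.0
    for j in range(n):
        t = -1.0 + (j + 0.5) * h
        total += math.sin(b * t) / (math.pi * t) * math.cos(k * x * t)
    return total * h


k = 20.0 * math.pi           # array 20 wavelengths long
b = 0.5 * k                  # sector -1/2 < x < 1/2
inside = sector_approximation(b, k, 0.0)
edge = sector_approximation(b, k, 0.5)
outside = sector_approximation(b, k, 0.9)
print(inside, edge, outside)
```

The value at the edge of the sector is close to one half, which is the usual behavior of a truncated Fourier representation at a jump.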

In Section 4.4 below, we demonstrate optimal element configurations that produce
approximations to the patterns (31), (32) with $k = 20\pi, 10\pi, 32.4676\pi$.


Remark 4.1 While (30) is by no means the only possible band-limited approximation
to $F_b$, it is quite satisfactory in most cases, in addition to being simple. Furthermore,
the principal purpose of this note is to describe a technique for the selection of locations
of the nodes, given a pattern to be approximated. Thus, we ignore the issue of the
optimal choice of $\tilde F_b$.

4.3  Cosecant  patterns 

Another standard far-field radiation pattern is the so-called cosecant pattern (see, for
example, [19]). Given two real numbers $0 < a < b < 1$, the cosecant pattern $F_{a,b}$ is
defined by the formula

$$F_{a,b}(x) = \frac{1}{x} \qquad (33)$$

for all $x \in [a, b]$, and

$$F_{a,b}(x) = 0 \qquad (34)$$

for all $x \in ([-1, 1] \setminus [a, b])$. Again, the function $F_{a,b}$ defined by the formulae (33), (34) is
not band-limited, and cannot be represented by an expression of the form (24). Before
the scheme of this note can be applied to $F_{a,b}$, the latter has to be approximated with a
band-limited function; as discussed in Section 4.1 above, if such an approximation is to
be useful as an antenna pattern, its supergain factor has to be controlled. Fortunately,
a procedure for such an approximation has been in existence for more than 35 years
(see [18]); the algorithm of [18] is a modification of the least-squares approach permitting
the user to limit the supergain factor of the obtained pattern explicitly. At the time, the
utility of the scheme of [18] was limited by the (perceived) difficulty in the numerical
evaluation of Prolate Spheroidal Wave functions; given the present state of numerical
analysis, this difficulty is non-existent, and it is this author's impression that the insights
of [18], [19] deserve more attention than they have been receiving.

4.4  Optimal  distributions  of  elements 

In this subsection, we briefly describe an algorithm for the construction of optimal (in
the sense defined below) element configurations for the generation of antenna patterns
given by (15), of which the patterns (29)-(31) are special cases. As will be seen, the
procedure is in fact applicable to the design of element configurations for very general
far-field patterns.

We start with observing that (15) expresses the far-field pattern F as an integral over
the interval $[-1, 1]$ of functions of the form

$$\sigma(u) \cdot e^{i k x u}, \qquad (35)$$

with $x = \cos(\theta)$ determined by the direction $\theta$ in which the far-field is being evaluated. In
other words, the problem of finding efficient antenna element distributions is equivalent
to that of constructing quadrature formulae for integrals of the form (8), with

$$\omega(x) = \sigma(x). \qquad (36)$$

In the cases when $\sigma$ is non-negative everywhere on the interval $[-1, 1]$, Theorem 2.2
guarantees the existence of Generalized Gaussian Quadratures, and [13, 28] provide a
satisfactory numerical apparatus for the construction of such quadratures. Obviously, the
patterns given by the formula (28) are not generated by non-negative source distributions,
except when

$$b \le \pi. \qquad (37)$$

Thus, for these (and many other) patterns, the conditions of Theorem 2.2 are violated,
and the existence of Generalized Gaussian Quadratures is not guaranteed. In our numerical
experiments, the techniques of [2] (after some tuning) have always been successful
in finding the Gaussian quadratures for integrals of the form (28); some of our results
are presented in Section 5 below.

5  Numerical  Examples 

In  this  section,  we  present  examples  of  optimal  element  distributions  generating  the 
patterns of the preceding section; all of the results presented here have been obtained
numerically.  Antenna  patterns  we  present  are  compared  to  the  antenna  patterns  given 
by  uniform  source  distributions;  configurations  of  elements  approximating  these  antenna 
patterns  are  compared  to  equispaced  distributions  of  elements  generating  the  same  an¬ 
tenna  patterns. 

5.1  Optimal  distributions  of  elements 

In this section, we demonstrate the results of the application of the techniques of Section 4.4
of this note to the types of antenna patterns described in Sections 4.2, 4.3.

In all cases, we choose the size of an antenna array and a pattern to be reproduced, and
use the scheme outlined in Section 4.4 to design a distribution of antenna elements (both
the locations and the intensities) located within the chosen array that reproduces the
required pattern. For comparison, we also generate optimal (in the least squares sense)
approximations to the desired pattern generated by equispaced elements located within
the  same  array.  Since  the  number  of  equispaced  nodes  required  to  obtain  a  reasonable 
approximation  to  the  desired  pattern  is  (in  many  cases)  much  greater  than  the  number  of 
optimally  chosen  nodes,  for  each  example  we  demonstrate  patterns  generated  by  several 
such  configurations.  In  this  manner,  the  numbers  of  optimally  chosen  nodes  necessary 
to  obtain  reasonable  approximations  to  the  desired  patterns  can  be  compared  to  the 
numbers  of  equispaced  nodes  required  to  obtain  similar  results. 
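The equispaced comparison described above can be sketched as a small least-squares problem: for fixed equispaced element locations, the intensities are chosen to minimize the mean-square deviation from the target pattern on a grid in x. The sketch below assumes isotropic point elements, an ideal sector target, and a solution via the normal equations; all names and parameter values are hypothetical, and this is not the generalized Gaussian quadrature scheme of Section 4.4.

```python
import cmath
import math


def solve(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small complex system."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    sol = [0j] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = s / aug[r][r]
    return sol


def equispaced_fit(target, k, n_elements, n_samples=400):
    """Least-squares intensities for equispaced point elements on [-1, 1];
    returns (intensities, relative L2 error on the sample grid)."""
    nodes = [-1.0 + 2.0 * j / (n_elements - 1) for j in range(n_elements)]
    xs = [-1.0 + (m + 0.5) * 2.0 / n_samples for m in range(n_samples)]
    basis = [[cmath.exp(1j * k * u * x) for u in nodes] for x in xs]
    values = [target(x) for x in xs]
    # Normal equations: (B^H B) beta = B^H values.
    gram = [[sum(row[p].conjugate() * row[q] for row in basis)
             for q in range(n_elements)] for p in range(n_elements)]
    rhs = [sum(row[p].conjugate() * v for row, v in zip(basis, values))
           for p in range(n_elements)]
    beta = solve(gram, rhs)
    resid = sum(abs(sum(b * e for b, e in zip(beta, row)) - v) ** 2
                for row, v in zip(basis, values))
    return beta, math.sqrt(resid / sum(abs(v) ** 2 for v in values))


def sector(x):
    """Ideal sector target (an assumption): 1 for |x| <= 1/2, 0 elsewhere."""
    return 1.0 if abs(x) <= 0.5 else 0.0


k = 10.0 * math.pi                        # array 10 wavelengths long
_, err11 = equispaced_fit(sector, k, 11)  # one element per wavelength
_, err21 = equispaced_fit(sector, k, 21)  # two elements per wavelength
print(err11, err21)
```

The normal equations are adequate here because the spacings used are at least half a wavelength; for denser equispaced grids the Gram matrix becomes ill-conditioned and a QR- or SVD-based solver would be the safer choice.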

5.1.1  Sector  patterns 

Example 5.1 The first example we consider is of the pattern defined by the formula (32),
with k = 62.8312, so that the size of the array is 20 wavelengths.

In Figure 5, we display an approximation to the pattern obtained with 19 elements,
overlaid with the exact pattern; the locations of the elements are displayed in Figure 5a;
the relative error of the obtained approximation is 5.01%.

Similarly, in Figure 5g, we display the approximation to the pattern obtained with 21
elements, overlaid with the exact pattern; the relative error of the obtained approximation
is 0.443%; in Figure 5h, we display the approximation obtained with 17 elements.
In the latter case, the relative error of the obtained approximation is 6.43%; Figure 5i
depicts the 17-node distribution producing the approximation illustrated in Figure 5h.
Finally, Figure 5j contains a graph of the values of the sources located at the 17 nodes
depicted in Figure 5i and generating the pattern shown in Figure 5h.

For comparison, the optimal approximations obtained with 19, 24, 29, 31, and 34
equispaced elements are displayed in Figures 5b, 5c, 5d, 5e, 5f, respectively; these are
also overlaid with the exact pattern.

Example 5.2 Our second example is identical to the first one, with the exception that
k = 31.416, so that the size of the array is 10 wavelengths.

In Figure 6, we display an approximation to the pattern obtained with 9 elements,
overlaid with the exact pattern; the locations of the elements are displayed in Figure 6a;
the relative error of the obtained approximation is 11.2%.

Similarly, in Figure 6f, we display the approximation to the pattern obtained with 11
elements, overlaid with the exact pattern; the relative error of the obtained approximation
is 0.600%.

For comparison, the optimal approximations obtained with 9, 14, 16, and 18 equispaced
elements are displayed in Figures 6b, 6c, 6d, 6e, respectively; these are also overlaid
with the exact pattern.

Example 5.3 Our third example is identical to the preceding two, with the exception
that k = 102, so that the size of the array is about 32.45 wavelengths.

In Figure 7a, we display an approximation to the pattern obtained with 23 optimally
distributed elements, overlaid with the exact pattern and with the pattern obtained with
23 equispaced elements.

The relative error of the obtained approximation is 5.4%; needless to say, the error of
the approximation obtained with the equispaced nodes is more than 70%. As can be seen
from Figure 7c, the actual size of the obtained 23-element array is about 21 wavelengths;
in other words, in order to obtain this precision, the array needs to be about 2/3 of the
nominal (maximum permitted) length.

In Figure 7b, we display the approximation to the pattern obtained with 42 and 48
equispaced elements, overlaid with the exact pattern.

It is worth noting that with 33 optimally distributed elements, the pattern is approximated
to the precision 0.12%; we do not display the obtained pattern since it is visually
indistinguishable from the pattern being approximated.

Example 5.4 Our final example is somewhat different from the preceding ones, in that
instead of approximating a sector pattern, we approximate a cosecant pattern (see (33), (34)
in Subsection 4.3 above).

In this example, we set

$$a = \sin(15^\circ), \qquad (38)$$

$$b = \sin(75^\circ), \qquad (39)$$

and use the procedure of [18] to approximate $F_{a,b}$ with a band-limited function. The band-limit
has been more or less arbitrarily set to 110, resulting in an antenna array about 35
wavelengths in size, and the supergain factor of the approximation was set to 1.1.

In Figure 8a, we display an approximation to the pattern obtained with 53 optimally
distributed elements, overlaid with the exact band-limited pattern and with the pattern
obtained with 53 equispaced elements.

The relative error of the obtained approximation is 1.79%; the error of the approximation
obtained with the equispaced nodes is about 42%.

In Figure 8b, we display the approximation to the pattern obtained with 47 optimally
distributed elements, overlaid with the exact pattern; the purpose of this final figure is
to demonstrate the behavior of the scheme when the number of elements is insufficient
(i.e. when the array is underresolved).

It is worth noting that it takes about 70 equispaced nodes to obtain the resolution
obtained with 47 optimally chosen ones.

The following observations can be made from Figures 5 - 8b, and from the more
detailed numerical experiments performed by the author.

1.  In  order  to  obtain  reasonable  precision,  the  scheme  requires  about  1  point  per  wave¬ 
length  in  the  antenna  array:  this  is  more  or  less  independent  from  the  structure  of  the 
beam  as  long  as  the  pattern  is  symmetric  about  the  point  x  =  0.  This  fact  is  observed 
numerically,  even  for  modest  numbers  of  nodes:  for  large-scale  arrays,  this  statement 
(interpreted  asymptotically)  can  be  proved  rigorously.  For  certain  beam  structures,  the 
required  number  of  nodes  is  even  less  (see  Example  5.3).  The  reasons  for  these  additional 
savings  are  subtle,  and  have  to  do  with  the  fact  that  the  continuous  source  distribution 
generating  the  pattern  is  relatively  small  on  a  large  part  of  the  antenna  array;  the  al¬ 
gorithm  of  [2]  takes  advantage  of  this  fact  to  reduce  the  number  of  nodes.  When  the 
beam is not symmetric about x = 0, the number of elements required does depend on
the structure of the pattern, and the dependence is fairly complicated. Generally, the
improvement for non-symmetric beams is less than that for the symmetric ones.

2. The qualitative behavior of the scheme is similar to that of the Gaussian quadratures
in  that  it  displays  no  convergence  at  all  until  a  certain  minimum  number  of  nodes  is 
achieved;  after  that,  the  convergence  is  very  fast.  This  behavior  is  not  surprising,  since 
the  scheme  is  based  on  a  Generalized  Gaussian  quadrature. 

3. For the sector pattern with the sector [-1/2, 1/2], the scheme reduces the required
number  of  nodes  by  a  factor  of  about  1.5  for  small-scale  problems,  and  roughly  by  a 
factor  of  2  for  large-scale  ones;  again,  for  large-scale  problems,  an  asymptotic  version  of 
this  statement  can  be  proven  rigorously. 

4. For the cosecant pattern with the parameters specified by (38), (39), the number
of nodes required is reduced by approximately a factor of 1.4. As the sidelobe level is
reduced, the improvement obtained by going from the equispaced discretization to the
optimal one increases rapidly.

5. An examination of Figures 5a, 6a shows that while the optimal nodes are by no means
uniform, they display no clustering behavior.

6. An examination of Figure 5j shows that the intensities of individual elements do not
become large; this is confirmed by the more extensive numerical experiments performed
by the author.

7. The combination of the preceding two paragraphs (combined with additional numerical
experiments and analysis) provides evidence that configurations of this type should
pose no supergain problems.

6  Generalizations 

The results described above admit radical generalizations in several directions; several
such directions are discussed below.

1. Conformal one-dimensional arrays. The extension of the techniques of this note
to one-dimensional arrays located on curves in $\mathbb{R}^3$ is completely straightforward, involving
only a modest increase of the CPU time requirements of the procedure. Improvement in
the number of nodes required to produce a prescribed pattern is similar to that in the
case of a linear array.

2. Planar two-dimensional arrays. A straightforward generalization of the results of
Sections 4, 5 is to rectangular planar arrays. Here, a tensor product quadrature can be
constructed from the quadratures of Sections 4, 5, possessing all of the desirable properties
of the latter. Obviously, the advantage in the number of transducers is squared,
so that (for example) replacing 50 nodes in each of the two directions by 23 nodes (see
Example 5.3 above) will lead to a factor of $(50/23)^2 \approx 4.7$ savings in the number of
elements.

The  theory  of  Section  4  has  been  extended  for  disk-shaped  arrays,  via  ( inter  alia)  the 
techniques  developed  in  [23].  The  improvement  in  the  number  of  nodes  is  comparable  to 
that obtained in the rectangular geometry, and the CPU time requirements do not differ
appreciably from those in the case of linear one-dimensional arrays.

The extension of the theory to more general geometries in the plane is in progress. At
the present time, our only numerical experiments have been with arrays on triangles; the
results  are  encouraging,  but  the  CPU  time  requirements  of  the  algorithms  are  excessive 
(we  have  only  been  able  to  design  triangular  arrays  about  6  wavelengths  in  size).  We 
are  now  in  the  process  of  constructing  a  more  efficient  numerical  procedure  for  such 
computations. 

3.  Conformal  two-dimensional  arrays.  The  only  environment  in  which  we  have 
a  satisfactory  theory  is  when  the  array  is  located  on  a  surface  of  revolution;  even  in 
this  environment,  no  experiments  have  been  performed.  We  have  not  investigated  more 
general conformal two-dimensional arrays in sufficient detail.

Figure 5e: The optimal approximation to the sector pattern generated by 31
equispaced nodes, as described in Example 5.1


Figure 6b: The optimal approximation to the sector pattern generated by 9
equispaced nodes, as described in Example 5.2


Figure 6c: The optimal approximation to the sector pattern generated by 14
equispaced nodes, as described in Example 5.2


Figure 7a: The approximation to the sector pattern generated by 23 optimal
elements, vs. optimal approximation by 23 equispaced nodes, as described in
Example 5.3


Figure 7b: The optimal approximations to the sector pattern generated by 42
and 48 equispaced nodes, as described in Example 5.3


Figure 8a: The approximation to the cosecant pattern generated by 53
optimal elements, vs. optimal approximation by 53 equispaced nodes, as
described in Example 5.4


Figure 8b: The approximation to the cosecant pattern generated by 47
optimal elements, as described in Example 5.4


We present a modification of the Fast Multipole Method (FMM) in two dimensions. While
previous implementations of the FMM have been designed for harmonic kernels, our algorithm
works for a large class of kernels that satisfy fairly general conditions, amounting to
the kernel being sufficiently smooth away from the diagonal. Our algorithm approximates
appropriately chosen parts of the kernel with “tensor products” of Legendre expansions and
uses the Singular Value Decomposition (SVD) to compress the resulting representations.
The obtained singular function expansions replace the Taylor and Laurent expansions used
in the original FMM. The algorithm requires O(N) operations, and is stable and robust. The
performance of the algorithm is illustrated with numerical examples.

1  Introduction


In this paper, we describe a fast algorithm for the evaluation of all pairwise interactions in
large ensembles of particles in the plane, i.e., sums of the form

$$u(x_i) = \sum_{j=1}^{N} q_j K(x_i, x_j), \qquad (1)$$

where $q_1, \ldots, q_N$ are arbitrary complex numbers, $x_1, \ldots, x_N$ are points in the plane, and
$K : \mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}$ is a non-oscillatory kernel. Such computations appear in a variety of numerical
methods for the solution of problems of computational physics.
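For reference, the sums (1) can always be evaluated directly at a cost proportional to the square of the number of particles; this is the baseline that a fast algorithm is meant to replace. The sketch below is such a direct evaluation. The logarithmic kernel and the decision to skip the terms with j = i (natural for kernels that are singular on the diagonal) are assumptions made for the example.

```python
import math


def direct_sum(points, charges, kernel):
    """Evaluate u(x_i) = sum_j q_j * K(x_i, x_j) for all i in O(N^2) operations.

    Terms with j == i are skipped, an assumption suited to kernels that are
    singular on the diagonal."""
    result = []
    for i, xi in enumerate(points):
        total = 0j
        for j, xj in enumerate(points):
            if i != j:
                total += charges[j] * kernel(xi, xj)
        result.append(total)
    return result


def log_kernel(x, y):
    """Example kernel (an assumption): K(x, y) = log|x - y| in the plane."""
    return math.log(math.hypot(x[0] - y[0], x[1] - y[1]))


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
charges = [1.0, 2.0, -1.0 + 1.0j]
potentials = direct_sum(points, charges, log_kernel)
print(potentials)
```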

The  algorithm  of  this  paper  is  a  version  of  the  Fast  Multipole  Method  (FMM)  in  two 
dimensions.  The  structure  of  the  FMM  algorithm  is  left  virtually  unchanged  from  the  one 
described in [3]. The version of the FMM algorithm used in this paper, however, replaces
the  Taylor  and  Laurent  expansions  with  “tensor  products”  of  Legendre  expansions  that  are 
subsequently  compressed  via  the  Singular  Value  Decomposition  (SVD).  This  approach  leads 
to  an  algorithm  that  can  be  applied  to  a  variety  of  non-oscillatory  kernels  that  are  sufficiently 
smooth  away  from  the  diagonal. 

In  two  dimensions,  the  original  Fast  Multipole  Method  (FMM)  relies  on  the  Taylor  and 
Laurent  expansions  (see  [14],  [7])  for  the  evaluation  of  Coulomb  interactions  in  large  ensem¬ 
bles  of  particles.  During  the  last  decade,  several  improvements  of  the  original  scheme  have 
been  suggested.  A  new  version  of  the  FMM,  based  on  specially  designed  singular  function 
expansions,  was  introduced  in  [10].  The  approach  taken  in  the  latter  paper,  when  used  in 
combination  with  an  intermediate  representation  consisting  of  complex  exponentials,  leads 
to  an  algorithm  that  is  about  five  times  as  fast  as  the  original  FMM,  due  to  the  reduction 
of  the  number  of  parameters  needed  to  represent  far  and  near  fields.  A  similar  technique 
was used in one dimension in [18]. A version of the FMM for polynomial interpolation
(see [5]) uses Chebyshev expansions that are compressed by a suitable change of basis obtained
via Singular Value Decomposition (SVD). Finally, an analytical apparatus based on
least  squares  approximation  of  integral  operators  was  developed  in  [17].  This  analytical 
apparatus  leads  to  fast  algorithms  for  a  fairly  large  class  of  kernels  in  one  dimension. 

The  plan  of  the  paper  is  as  follows.  In  Section  2,  we  introduce  mathematical  and 
numerical  preliminaries.  In  Sections  3  and  4,  we  describe  a  generalized  Fast  Multipole 
Method  in  two  dimensions  and  present  the  complexity  analysis.  Finally,  in  Section  5  we 
demonstrate  the  performance  of  the  algorithm  with  several  numerical  examples. 

2  Mathematical  Preliminaries 

2.1  Gaussian  Integration  and  Interpolation 

In what follows, we will denote by $P_n^{a,b}$ the n-th Legendre polynomial on the interval $[a, b] \subset \mathbb{R}$.
We will refer to the roots $x_1^{a,b}, \ldots, x_n^{a,b}$ of $P_n^{a,b}(x)$ as the Gaussian nodes of order n and
will denote by $w_1^{a,b}, \ldots, w_n^{a,b}$ the weights of the corresponding Gaussian quadrature on the
interval $[a, b]$. We will denote by $L_n$ the projection from the space of continuous functions
on the interval $[a, b]$ to the space of polynomials of order n, preserving the function values at
the Gaussian nodes. For a given continuous function $f : [a, b] \to \mathbb{C}$, the function $L_n f(x)$ is
the polynomial of order n such that $L_n f(x_i^{a,b}) = f(x_i^{a,b})$. As is well known, for all $x \in [a, b]$,

$$L_n f(x) = \sum_{k=0}^{n-1} a_k \cdot P_k^{a,b}(x), \qquad (2)$$

and the coefficients $a_k$ are given by the formula

$$a_k = \sum_{m=1}^{n} w_m^{a,b} \cdot f(x_m^{a,b}) \cdot P_k^{a,b}(x_m^{a,b}). \qquad (3)$$

The polynomial $L_n f$ will be referred to as the n-th order Legendre expansion of the function
$f$. For any integer n we will denote by $\|L_n\|_\infty$ the $L^\infty$-norm of the operator $L_n$, defined by
the formula

$$\|L_n\|_\infty = \sup_{\|f\|_{L^\infty[a,b]} = 1} \|L_n f\|_{L^\infty[a,b]}. \qquad (4)$$

We will denote by $\alpha_1(x), \ldots, \alpha_n(x)$ the set of polynomials of order n defined by the
formulae

$$\alpha_i(x) = \prod_{k \ne i} \frac{x - x_k}{x_i - x_k}, \qquad i = 1, 2, \ldots, n, \qquad (5)$$

where $x_1, \ldots, x_n$ are the Gaussian nodes of order n on the interval $[a, b]$. It is readily seen
from (5) that for any continuous function $f : [a, b] \to \mathbb{C}$,

$$L_n f(x) = \sum_{k=0}^{n-1} a_k \cdot P_k^{a,b}(x) = \sum_{i=1}^{n} f(x_i) \cdot \alpha_i(x). \qquad (6)$$
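The construction in (2), (3) can be sketched in a few lines. The code below computes Gaussian nodes and weights by Newton iteration, forms the coefficients from formula (3), and evaluates the expansion (2). It assumes that the Legendre polynomials on [a, b] are normalized to have unit L2 norm, which is the normalization under which (3) holds as written; the function names and the test function are hypothetical.

```python
import math


def gauss_legendre(n):
    """Nodes and weights of the n-point Gaussian quadrature on [-1, 1],
    found by Newton iteration on the Legendre polynomial P_n."""
    nodes, weights = [], []
    for i in range(1, n + 1):
        t = math.cos(math.pi * (i - 0.25) / (n + 0.5))   # initial guess
        for _ in range(50):
            p0, p1 = 1.0, t
            for k in range(2, n + 1):
                p0, p1 = p1, ((2 * k - 1) * t * p1 - (k - 1) * p0) / k
            dp = n * (t * p1 - p0) / (t * t - 1.0)        # derivative of P_n at t
            step = p1 / dp
            t -= step
            if abs(step) < 1e-15:
                break
        nodes.append(t)
        weights.append(2.0 / ((1.0 - t * t) * dp * dp))
    return nodes, weights


def orthonormal_legendre(k_max, x, a, b):
    """Values of P_0, ..., P_{k_max} on [a, b], scaled so that
    int_a^b P_j(x) P_k(x) dx = delta_jk (normalization assumed for (3))."""
    t = (2.0 * x - a - b) / (b - a)
    vals = [1.0, t]
    for k in range(2, k_max + 1):
        vals.append(((2 * k - 1) * t * vals[-1] - (k - 1) * vals[-2]) / k)
    return [math.sqrt((2 * k + 1) / (b - a)) * vals[k] for k in range(k_max + 1)]


def legendre_expansion(f, n, a, b):
    """Coefficients a_0, ..., a_{n-1} of L_n f from formula (3), plus the nodes."""
    ts, ws = gauss_legendre(n)
    xs = [0.5 * (b - a) * t + 0.5 * (a + b) for t in ts]
    ws = [0.5 * (b - a) * w for w in ws]
    coeffs = [0.0] * n
    for x, w in zip(xs, ws):
        p = orthonormal_legendre(n - 1, x, a, b)
        fx = f(x)
        for k in range(n):
            coeffs[k] += w * fx * p[k]
    return coeffs, xs


def evaluate(coeffs, x, a, b):
    """Evaluate the expansion (2) at the point x."""
    p = orthonormal_legendre(len(coeffs) - 1, x, a, b)
    return sum(c * pk for c, pk in zip(coeffs, p))


coeffs, xs = legendre_expansion(math.exp, 12, 0.0, 2.0)
node_error = max(abs(evaluate(coeffs, x, 0.0, 2.0) - math.exp(x)) for x in xs)
off_node_error = abs(evaluate(coeffs, 0.7, 0.0, 2.0) - math.exp(0.7))
print(node_error, off_node_error)
```

The expansion reproduces the function values at the Gaussian nodes up to roundoff, as the definition of the projection requires, and for a smooth function such as the exponential the error away from the nodes is also very small for modest n.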

For any natural n and continuous function $f : [a, b] \to \mathbb{C}$, we will denote by $E_n f$ the
error of the best approximation to $f$ among all polynomials of order n, i.e.,

$$E_n f = \min_{P} \|f - P\|_{L^\infty[a,b]}. \qquad (7)$$

Let $\rho > 0$ be an arbitrary positive real number. For any analytic function $f : \mathbb{C} \to \mathbb{C}$, we
will denote by $M([a, b], f, \rho)$ the maximum of the absolute value of $f$ in the $\rho$-neighborhood
of the interval $[a, b]$, i.e.,

$$M([a, b], f, \rho) = \sup_{x \in [a,b]} \; \sup_{\theta \in [-\pi, \pi]} |f(x + \rho e^{i\theta})|. \qquad (8)$$

The following five lemmas are well known. Their proofs can be found, for example, in
[16], [12].

Lemma  2.1.  If  n  >  0  is  an  integer,  and  P  :  C  ->  C  is  a  polynomial  of  order  n,  then  for 
any  interval  [a,  6]  C  R, 


(9)

Lemma 2.2. For any continuous function $f : [a, b] \to \mathbb{C}$,

$$\|f - L_n f\|_{L^\infty[a,b]} \le (1 + \|L_n\|_\infty) \cdot E_n f. \qquad (10)$$

Lemma 2.3. For any n times continuously differentiable function $f : [a, b] \to \mathbb{C}$,

$$E_n f \le \frac{2}{n!} \cdot \left(\frac{b - a}{4}\right)^n \cdot \|f^{(n)}\|_{L^\infty[a,b]}. \qquad (11)$$

Lemma 2.4. If $f : \mathbb{C} \to \mathbb{C}$ is an analytic function, then for any positive real $\rho > 0$,

$$\|f^{(n)}\|_{L^\infty[a,b]} \le n! \cdot \frac{M([a, b], f, \rho)}{\rho^n}. \qquad (12)$$

Lemma 2.5. For any natural n,

$$\|L_n\|_\infty < n. \qquad (13)$$
By  combining  (9),  (10),  (11),  (12),  and  (13),  we  obtain  the  following  theorem  describing 
the  rate  of  convergence  of  Legendre  expansions  of  an  analytic  function  on  the  interval  [a,  6]. 

Lemma 2.6. Suppose that $f : \mathbb{C} \to \mathbb{C}$ is an analytic function, and that for some positive $p > (b - a)/4$,

$$M([a, b], f, p) < \infty. \tag{14}$$

Then

$$\lim_{n \to \infty} \|f - L_n f\|_{L^\infty[a,b]} = 0. \tag{15}$$

Furthermore,  for  any  n  >  1, 


11/  /'n/||z,»[o,6]  <  2(1  +n) -M([a,  b],f,  p)  •  (~^~)  •  (16) 
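As an informal illustration of the convergence described by Lemma 2.6, the following sketch interpolates a function at Gaussian nodes using the Lagrange form (6) and measures the maximum error on a sample grid. The function names (`gauss_nodes`, `interpolate`, `max_error`), the Newton iteration used to locate the nodes, and the test function $e^x$ are assumptions made for this example; they are not taken from the paper.

```python
import math


def legendre(n, x):
    """Return (P_n(x), P_n'(x)) using the three-term recurrence (n >= 1)."""
    p_prev, p = 1.0, x
    for k in range(2, n + 1):
        p_prev, p = p, ((2 * k - 1) * x * p - (k - 1) * p_prev) / k
    dp = n * (x * p - p_prev) / (x * x - 1.0)
    return p, dp


def gauss_nodes(n, a, b):
    """Gaussian (Legendre) nodes of order n mapped to [a, b], in ascending order."""
    nodes = []
    for i in range(1, n + 1):
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))  # standard initial guess
        for _ in range(100):                            # Newton iteration on P_n
            p, dp = legendre(n, x)
            dx = p / dp
            x -= dx
            if abs(dx) < 1e-15:
                break
        nodes.append(0.5 * (b - a) * x + 0.5 * (a + b))
    return sorted(nodes)


def interpolate(f, nodes, x):
    """Evaluate the interpolant sum_i f(x_i) * alpha_i(x) at the point x."""
    total = 0.0
    for i, xi in enumerate(nodes):
        alpha = 1.0
        for k, xk in enumerate(nodes):
            if k != i:
                alpha *= (x - xk) / (xi - xk)
        total += f(xi) * alpha
    return total


def max_error(f, n, a, b, samples=200):
    """Maximum interpolation error over an equispaced sample grid on [a, b]."""
    nodes = gauss_nodes(n, a, b)
    grid = (a + (b - a) * j / samples for j in range(samples + 1))
    return max(abs(f(x) - interpolate(f, nodes, x)) for x in grid)


if __name__ == "__main__":
    for n in (2, 4, 8):
        print(n, max_error(math.exp, n, 0.0, 1.0))
```

For an entire function such as $e^x$ the printed errors should fall off rapidly as $n$ grows, which is the qualitative behavior the lemma predicts.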

A standard approach to the construction of polynomial approximations of functions in higher dimensions is to expand them into “tensor products” of one-dimensional Legendre polynomials. For an $m$-dimensional cube $Q = [a_1, b_1] \times \ldots \times [a_m, b_m]$ and continuous function $f : Q \to \mathbb{C}$, we will denote by $L_n f$ the (unique) polynomial of $m$ variables having the form

$$L_n f(x_1, \ldots, x_m) = \sum_{k_1=0}^{n-1} \cdots \sum_{k_m=0}^{n-1} a_{k_1,\ldots,k_m} \cdot P_{k_1}^{a_1,b_1}(x_1) \cdot \ldots \cdot P_{k_m}^{a_m,b_m}(x_m), \tag{17}$$

and coinciding with $f$ on the $n^m$ “tensor product” Gaussian nodes

[Zki  j,  ki  =  =  l,...,n;  (18) 

the  coefficients  are  given  by  the  formula 


n— 1  n— 1 

°*i>~>*m  =  53  ...  53  wkl,bl'- 

*1=0  *m= o 


m-/(*s  .•  •  (^r6m). 

(19) 




In a mild abuse of terminology, we will be referring to such polynomials as polynomials of order $n$ in $\mathbb{R}^m$ and to expansions of the form (17) as Legendre expansions of order $n$ in the cube $Q \subset \mathbb{R}^m$. For an analytic function $f : \mathbb{C}^m \to \mathbb{C}$, we will denote by $M(Q, f, p)$ the maximum of the absolute value of $f$ in the $p$-neighborhood of the cube $Q$, i.e.,

$$M(Q, f, p) = \max_{1 \le k \le m} \; \sup_{(x_1,\ldots,x_m) \in Q} \; \sup_{\theta \in [-\pi,\pi]} |f(x_1, \ldots, x_k + p e^{i\theta}, \ldots, x_m)|. \tag{20}$$


The  following  two  lemmas  are  a  simple  consequence  of  Lemmas  2.1  and  2.6;  they  can 
be  viewed  as  multidimensional  analogues  of  the  latter  (see  for  example  [17]). 

Lemma 2.7. If $n > 0$ is an integer and $P : \mathbb{C}^m \to \mathbb{C}$ is a polynomial of order $n$, then for any cube $Q = [a, b]^m \subset \mathbb{R}^m$,


j^p2i|P|li2W)  -  ^  j^fQ[m/2llPil^(Q)-  (21) 

Lemma 2.8. Suppose that $f : \mathbb{C}^m \to \mathbb{C}$ is an analytic function on $\mathbb{C}^m$, and that for some positive $p > (b - a)/4$,

$$M([a, b]^m, f, p) < \infty. \tag{22}$$

Then,  for  any  n  >  1, 


11/  -  £»/|U~Mm  <2(1  +n)m  •  M([a,br,f,p)  •  (^)”  .  (23) 


2.2  Singular  Value  Decomposition  of  Integral  Operators 
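Let $T : L^2(Y) \to L^2(X)$ be an integral operator given by the formula

$$(T \cdot f)(x) = \int_Y K(x, y) f(y)\, dy, \tag{24}$$

where $K$ is a square integrable function on $X \times Y$, i.e.,

$$\|K(x, y)\|_{L^2(X \times Y)} = \left( \int\!\!\int_{X \times Y} |K(x, y)|^2 \, dx\, dy \right)^{1/2} < +\infty. \tag{25}$$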


The  function  K  :  X  x  Y  -4  R  is  usually  referred  to  as  the  kernel  of  the  integral  operator  T. 
The  following  theorem  can  be  found  (in  a  more  general  form)  in  [15]. 


Theorem 2.9. For any $K \in L^2(X \times Y)$, there exist two orthonormal systems of functions $\{u_k\} \in L^2(X)$, $\{v_k\} \in L^2(Y)$, and a sequence of nonnegative numbers $s_1 \ge s_2 \ge \ldots \ge 0$ for $k = 1, 2, \ldots$, such that

$$K(x, y) = \sum_{k=1}^{\infty} u_k(x) s_k v_k(y), \tag{26}$$

in $L^2(X \times Y)$ sense,

$$\sum_{k=1}^{\infty} |s_k|^2 < +\infty, \tag{27}$$

and the sequence $\{s_k\}$ is uniquely determined by $K$.
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Formula  (26)  is  normally  referred  to  as  the  singular  value  decomposition  (SVD)  of  the 
operator $T$ (or the kernel $K$). The functions $u_k$ and $v_k$ are usually referred to as the left and the right singular functions, respectively, and the numbers $s_k$ are referred to as singular values of the operator $T$ (or the kernel $K$).

The singular value decomposition can be used to construct finite-dimensional approximations to the operators of the form (24) and the corresponding kernels $K$. Specifically, given a positive real $\varepsilon > 0$, one can truncate the expression (26) after a finite number $p$ of terms, leading to the expression

$$K(x, y) \approx \sum_{k=1}^{p} u_k(x) s_k v_k(y). \tag{28}$$

Now, if $p$ has been chosen in such a manner that

$$\sqrt{\sum_{k=p+1}^{\infty} s_k^2} < \varepsilon, \tag{29}$$

then due to (26),

$$\Big\| K(x, y) - \sum_{k=1}^{p} u_k(x) s_k v_k(y) \Big\|_{L^2(X \times Y)} < \varepsilon. \tag{30}$$


Theorem 2.10 (Minimal property of the SVD). Suppose that the SVD of the operator $T : L^2(Y) \to L^2(X)$ with the kernel $K : X \times Y \to \mathbb{R}$ is given by the formula

$$K(x, y) = \sum_{k=1}^{\infty} u_k(x) s_k v_k(y). \tag{31}$$

Then for any $f \in L^2(Y)$,

$$\Big\| (T \cdot f)(x) - \sum_{k=1}^{p} u_k(x) s_k b_k \Big\|_{L^2(X)} \le s_{p+1} \|f\|_{L^2(Y)}, \tag{32}$$

where the coefficients $b_k$ are given by the formula

$$b_k = \int_Y f(y) v_k(y)\, dy. \tag{33}$$


2.3  Approximation  of  the  SVD  of  Integral  Operators

The  following  theorem  is  a  straightforward  generalization  of  Theorem  2.10. 

Theorem 2.11 (Approximation of the SVD). Suppose that the operator $T : L^2(Y) \to L^2(X)$ is defined by (24), that there exist a positive number $\delta > 0$ and a square integrable function $\tilde{K} : X \times Y \to \mathbb{R}$ such that

$$\|K(x, y) - \tilde{K}(x, y)\|_{L^2(X \times Y)} < \delta, \tag{34}$$

and that the SVD of $\tilde{K}$ is given by the formula

$$\tilde{K}(x, y) = \sum_{k=1}^{\infty} u_k(x) s_k v_k(y). \tag{35}$$

Then for any $f \in L^2(Y)$,

$$\Big\| (T \cdot f)(x) - \sum_{k=1}^{p} u_k(x) s_k b_k \Big\|_{L^2(X)} \le (\delta + s_{p+1}) \|f\|_{L^2(Y)}, \tag{36}$$

where the coefficients $b_k$ are given by the formula

$$b_k = \int_Y f(y) v_k(y)\, dy. \tag{37}$$

Proof. Obviously, (34) implies

$$\Big\| \int_Y K(x, y) f(y)\, dy - \int_Y \tilde{K}(x, y) f(y)\, dy \Big\|_{L^2(X)} < \delta \|f\|_{L^2(Y)}, \tag{38}$$

and from Theorem 2.10, we obtain

$$\Big\| \int_Y \tilde{K}(x, y) f(y)\, dy - \sum_{k=1}^{p} u_k(x) s_k b_k \Big\|_{L^2(X)} \le s_{p+1} \|f\|_{L^2(Y)}. \tag{39}$$

Now, (36) follows immediately from (38), (39), and the triangle inequality. □


3  Analytical  Apparatus 

In  the  remainder  of  this  paper,  we  will  be  assuming  that  all  charges  are  located  in  a  unit 
square  [0, 1]  x  [0, 1]  in  R2. 


3.1  Notation 

We will denote by $Y^{(l,k_1,k_2)}$ the square

$$\left[ \frac{k_1 - 1}{2^l}, \frac{k_1}{2^l} \right] \times \left[ \frac{k_2 - 1}{2^l}, \frac{k_2}{2^l} \right], \tag{40}$$

with $k_1 = 1, \ldots, 2^l$ and $k_2 = 1, \ldots, 2^l$; $l$ will be referred to as the level of the square $Y^{(l,k_1,k_2)}$, and $(k_1, k_2)$ will be referred to as the coordinates of the square $Y^{(l,k_1,k_2)}$. We will denote by $Z^{(l,k_1,k_2)}$ the union of the square $Y^{(l,k_1,k_2)}$ and its immediate neighbors on the level $l$. We will define the subset $X^{(l,k_1,k_2)}$ of $[0, 1] \times [0, 1]$ by the formula

$$X^{(l,k_1,k_2)} = [0, 1] \times [0, 1] \setminus Z^{(l,k_1,k_2)}, \tag{41}$$

and refer to $X^{(l,k_1,k_2)}$ as the interaction domain of the square $Y^{(l,k_1,k_2)}$. In other words, the interaction domain of the square $Y^{(l,k_1,k_2)}$ consists of all squares on level $l$ that are not immediate neighbors of $Y^{(l,k_1,k_2)}$ and not $Y^{(l,k_1,k_2)}$ itself. For consistency, we will also be referring to the unit square $[0, 1] \times [0, 1]$ as $Y^{(0,1,1)}$.
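The interaction domain can be made concrete with a small sketch. It assumes squares on level $l$ are indexed by integer coordinates $(k_1, k_2)$ with $k_1, k_2 = 1, \ldots, 2^l$, and that "immediate neighbors" means squares sharing an edge or a corner. The function names are hypothetical and chosen only for this illustration.

```python
def square_bounds(l, k1, k2):
    """Return ((x_min, x_max), (y_min, y_max)) of the square with coordinates (k1, k2) on level l."""
    h = 1.0 / 2 ** l
    return ((k1 - 1) * h, k1 * h), ((k2 - 1) * h, k2 * h)


def near_field(l, k1, k2):
    """Coordinates of the square itself and its immediate neighbors on level l."""
    side = 2 ** l
    return {
        (i, j)
        for i in range(max(1, k1 - 1), min(side, k1 + 1) + 1)
        for j in range(max(1, k2 - 1), min(side, k2 + 1) + 1)
    }


def interaction_domain(l, k1, k2):
    """Coordinates of all level-l squares that are neither the square nor its neighbors."""
    side = 2 ** l
    every_square = {(i, j) for i in range(1, side + 1) for j in range(1, side + 1)}
    return every_square - near_field(l, k1, k2)


if __name__ == "__main__":
    # A corner square on level 2 has 3 neighbors, so 16 - 4 = 12 squares remain.
    print(len(interaction_domain(2, 1, 1)))
```

Note that on levels 0 and 1 every square touches every other square, so the interaction domain is empty there under these assumptions.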

Suppose  now  that  the  function  K  :  F^0*1-1)  x  F^-1-1)  ->  C  is  such  that 


W*2)  [Jy^  \K^dy)  **  <  +00’ 

(42) 

and 

r  /  *  v 

/*«**>  Uxo***  \K{-y'x^dx)  dv<+°°, 

(43) 

for  all  /  >  1,  k\  = 

1, . - . , 2f ,  k2  =  l,...,2i.  For  any  square  F0>*  1  >**) 

we  will  define  the 

integral  operators 

pUMM)  .  £2(Y(iM,k2)j  x,2(x(l,fcl,fc2)), 

(44) 

p(l,ki,k2)  .  £2^{l,ki,k2)'i  £2(F(i,fcl ’**)), 

(45) 

by  the  formulae 

y)iy~ 

(46) 

■  <r)(v)  =  K(y,x)a(z)  dx. 

(47) 

The  function  (P(Ul’*5>  •  a)  6  L2(X^k »•*»))  with  cr  g  L2(Y^k^)  will  be  referred  to 
as  the  potential  due  to  the  charge  distribution  a  on  the  square  F^’*1’*2).  Similarly,  the 

function  ■  a)  €  with  <r  g  L2(X^’k^)  will  be  referred  to  as’  the 

incoming  potential  due  to  some  charge  distribution  a  on  X^l,kl'k2K 
Due  to  (42),  (43),  and  Theorem  2.9,  there  exist  functions 


{uf’ *2)}  €  i2(y(/,fc1,fc2))j  ^out,(Z,fcl,fc2)j  g  L2(YV,kuk2)^ 
{uout,(l,kuk2)y  €  L2{XV’kl’ *2)),  {v‘n>(Z-fci-fc2)|  g  ^(X^L*2)), 

and  positive  real  numbers 

vt,(l,ki,k2)y 

such  that 

JiTfoy)  = 

tel 

*(y,*)  = 

*=1 


(48) 

(49) 

(50) 

(51) 

(52) 


We will refer to (51), (52) as the outgoing and incoming singular value decompositions for the square $Y^{(l,k_1,k_2)}$, respectively.

We  will  be  using  finite-dimensional  approximations  to  the  operators  (44),  (45)  obtained 
by  truncating  expressions  (51),  (52)  after  a  finite  number  of  terms.  Specifically,  given  two 
natural  numbers  p\  and  rj,  we  will  define  the  operators 


pVMte)  .  L2 L2(XV’kl'k*)), 
rVMM)  .  L*(Xmto))  L2 (ydA^fca)) 




(53) 

(54) 


by  the  formulae 


(piUkiM) .  a)(a;)  _  jY{i  ^  ki)  Kpi  ( x ,  y)a{y)  dy, 

.  a>(y)  _  J^  ^  ^KTl(y,x)cr{x)dx, 

with 

=  £ur’(,’"1’fc2)(x)sr,('’fcl’fc2)vrt,(z,fcl'fc2)(y)1 

fc=l 

*n(y,*)  = 

fc=l 

Substituting  (57),  (58)  into  (55),  (56),  we  obtain 


(55) 

(56) 


(57) 

(58) 


{Pji[’k i’*2>  •  f)(x)  =  '£u?t’«’k''k>\x)s?t'«’k 

k= 1 

with  the  coefficients  a™t’{l'kl'k*'>  given  by  the  formula 

OUt,(lMM)  _  f  rnt, (/,*!, Jfc2),  x  .  ,  , 

°*  ~JY^2)Vk  (yMy)dy, 


and 


(4l;klM)  ■  a)(y)  = 

k=  1 

with  the  coefficients  ai^{lMM)  given  by  the  formula 


(59) 


(60) 

(61) 


(62) 


The  function  (P^  11  2)  •  a)  €  L2(X(^))  with  a  e  L2(Y«’k^))  wm  be  referred  to 
as  the  outgoing  smgular  function  expansion  due  to  the  charge  distribution  a  on  the  square 
7  ’  11  •  Similarly,  the  function  (P?;kl’k2)  •  a)  €  L2{X<-1  *'**))  with  a  €  L2{Y will 

be  referred  to  as  the  incoming  singular  function  expansion  due  to  some  charge  distribution 


3.2  Singular  Function  Expansions  of  the  Potentials 

The  following  theorem  provides  a  tool  for  approximating  potentials  produced  by  arbitrary 
charge  distributions. 

Theorem  3.1.  Suppose  that  the  outgoing  potential  g(l’ki*ki)  g  L2(X^l,k »«**))  is  induced  bv 
the  charge  distribution  cr(l<ki’ki)  :  p,  {-  e 


MMte) 


(x)  =  (pm*,) .  ^ K(x,y)c^\y)dy. 


(63) 
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Then 


k= l 

with  the  coefficients  given  by  the  formula 

a«*,<!Mto)  =  jf  ^  ^l^(y)v^l*^\y)dy. 

Furthermore,  for  any  p>  1, 

UptW.Wjj.)  .  £u«^i**)(a.)J^,*)<,r«w1*)||ii(jc(ijii  (1))  £ 


and 


]T  |a~t’(i’fcl>fc2)|2  < 

k= i 


(64) 


(65) 


(66) 

(67) 


Proof. (66) follows immediately from Theorem 2.10. Singular values $s_k^{\mathrm{out},(l,k_1,k_2)}$ converge to zero as $k \to \infty$; therefore, (66) implies (64). Finally, due to (65), $a_k^{\mathrm{out},(l,k_1,k_2)}$ are the coefficients in the orthonormal basis $\{v_k^{\mathrm{out},(l,k_1,k_2)}\}$, from which (67) follows immediately. □

3.3  Translation  Operators  and  Error  Bounds

The  following  three  theorems  constitute  the  principal  analytical  tool  for  manipulating  out¬ 
going  and  incoming  singular  function  expansions.  Theorems  3.2,  3.4  provide  formulae  for 
the  translation  of  outgoing  and  incoming  singular  function  expansions,  respectively.  Theo¬ 
rem  3.3  describes  a  mechanism  for  converting  an  outgoing  singular  function  expansion  into 
an  incoming  singular  function  expansion. 

Theorem  3.2  (Outgoing  to  Outgoing).  Suppose  that  the  outgoing  singular  function  ex¬ 
pansion  .  l2(: xV’kite))  R  is  given  by  the  formula 

g<mt,(l,kuk2)(xj  _  guout,(i,fc1>fcs)^^outI(I)fc1,*2)a«1tI(i,Jk1)fca)  ^ 

fc=l 


with  the  coefficients  suc/j 


out,{l,ki,k2),2 


<  +00, 


k=l 


(69) 


and  that  y(*>*i.*2)  q  y(t— i,»ni,mj)> 

Then  there  exists  a  linear  mapping 


(70) 
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converting  the  sequence  oj  coefficients 
m  =  1)  2, . . defined  by  the  formulae 


of  coefficients  {a?t'«’kl’k^})  k  =  1, 2, . . .  into  the  sequence 
f  the  formulae 

oo 

=  V-  .(/-!, mi, s 
m  mk  >  (71) 

__  ly(Wi)V^'^%)v^.V-  1«™)(V)<^,  (72) 

stzcA  t/iat  /or  a//  x  inside  X^~1,mi<m2\ 

gout,(l,kuk2)(x)  =  £  u^t,(/-l,m1)m2)^s^-i,mi)m2)oaut)(/_i,mi>m2)> 

m=l  771 


(73) 


Furthermore,  for  any  p  >  1, 


m=l 


^5p+i 


oo 

E  la 


u<nit,(;  l,m1,m2)  fx\gOut,(l—l,mi,Tn2)out,(l—l,nn,m2)  1 1 

v  '  m  um  llL2(X('-1.'"i.'"2))  < 

(74) 


out  ff,kuk2)  12 


(75) 


Proof.  We  observe  that  can  fie  viewed  as  the  potential 

=  K(X,y)^MMHy)dy 

induced  by  the  charge  distribution  o^2)  :  L2(Y^2))  ->  R,  defined  by  the  formula 

*< lM,k2)(y)  =  f 

fc=l  V  ' 

We  will  denote  by  the  charge  distribution  on  the  square  riven  bv 

the  formula  6  y 

ff(l-l,mi,m2)/  t  _  |  or^’*1,*J)(y)J  if  y  £  y(*.*i.*2)( 

10,  if  y  e  y(*-l."*i,m2)\y(t,*i,*2)  (77) 

and  by  p(*-i.mi,.»2)  the  outgoing  potential  on  X^^2)  due  to  the  distribution  ad-^um) 
on  the  square 


g(l  1  ,m,,m2)^  =  (p(l-l,mi,m2)  .  a(t-l,mitm2)^j  _ 

=  fYv-umi,m2)  K{x>  y)°(l~l'mum2)(y)  dy. 


(78) 


Due  to  Theorem  3.1, 

oo 

P(i"l’mi,m2)(x)  =  E  U^t>('-1-mi^)(z)s^.(/-l,mI,m2)aout>(/_l,mi,m2) 

m=l  ’  V  ' 
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with  the  coefficients  Omt,(/  defined  by  the  formula 

' '“"I'’”1'"’)  =  Jy„. dy.  (JO) 

Now,  using  (77),  we  have 

y)v^‘-^‘^(y)dy.  (81) 

Substituting  (76)  into  (81),  we  arrive  at 

= pr-(UM  <‘«-'-™\y)dy)  - 

1  / 
oo 

=  \  cm£,(i,A:i,fc2)  a  1, mi  ,7712) 

^  mA:  5 


k-1 


where 


Now,  from  the  combination  of  (78)  and  Theorem  2.10,  we  obtain 

11  fy(<-i,mi.m2)  K(x,y)a^m^(y)  dy  - 
p 

m  {X)S™  am  < 


(82) 

(83) 


m= 1 

<  bml >m2)  ||_(<-l,n»i  ,m2 )  1 1 

-  Sp+1  IF  2;||i2(y(l-l,m1,m2)). 


(84) 


Due  to  (77),  we  have 

p 

El£out,(J-1,mi,m2)/  \  ou«,(/-l,mi,m2)acwt,(/-l,mi,m2)||  ^ 

m  '*'“m  am  llL2(A'(,-1’mi.m2))  < 


Thus, 


-5P+1  IF  'llL2(y((,k1,*2)). 


(85) 


V 

Wg^’V’^’^ix)  -  ^  U^t*(i-1-mi,m2)^seut,(i-l,m1,m2)0out,(i-l,m1,m2)|| 

m=  1 


<-  out,(/-l,mi,m2) 

-  5p+l 


N 


|0™*.(^*2)|2 


k= 1 


|i2(_S:(<-l.tni.m2))  < 


(86) 


f“fUy’  ]h®  smgular  vaIues  *?’{l'kuk2)  converge  to  zero  as  k  -*  oo;  therefore,  (86)  implies 
(73),  and  from  the  combination  of  (76),  (77),  (80),  we  have 


T7l=l  '  • 


k=l 


(87) 
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□ 

The proofs of the following two theorems are virtually identical to that of Theorem 3.2 and are omitted.


Theorem  3.3  (Outgoing  to  Incoming).  Suppose  that  the  outgoing  singular  function  ex¬ 
pansion  .  L2(X(‘*M)  -+  R  is  given  by  the  formula 


uu 

_  y*  ^sout,(Uki,k2) aout,(l,kiM) 


k=  1 


(88) 


with  the  real  coefficients  a™t'{l'kuk2'>  such  that 


Ei 

ifc=i 


out,({,fci,*2)|2 


<  +00, 


(89) 


and  that  Y^l,m  1,m2)  c  X^,kl,k2\ 

Then  there  exists  a  linear  mapping 


B{l,mum2UlMM)  .  /2(^  ^  /2(iv)  (90) 

converting  the  sequence  of  coefficients  {a?t’(l'kl’k2)},  A:  =  1,2,...  into  the  sequence  ) 

m  —  1, 2, . . defined  by  the  formulae 

oc 

=  o(^mi ,1712), out, 

m  ajfc  > 


fc=l 

Bar***** = jf, . 

such  that  for  all  x  inside  Y^l,mi’m2\ 

00 

gOut,(l,ki,k2)(xj  _  u»n,(i, mi, (/,mllm2)a*n,(i,mi,m2) 

m=l 

and 

00  OO 

|atn,(J,mi,m2)|2  ^  |aout,(i,fci,A:2)|2 


m=l 

Furthermore,  for  any  p  >  1, 


Jt=l 


(91) 

(92) 

(93) 

(94) 


||^0t,t,^’*1’*2)(x)  -  utn,(/,mi,m2)^xjsin,(/,mi,m2)atn,(J,ml,m2)|| 


m=l 


£,2(y(/,mltm2)) 


/  in, m2) 
E  -Sp+i 


Ei  aOUt,(^.fcl.*2)|2 

k= 1 


(95) 
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Theorem  3.4  (Incoming  to  Incoming).  Suppose  that  the  incoming  singular  fundi 
expansion  i»^2))  — >  Jl  is  given  by  the  formula 


oo 

gin,{lMM)  ^  _  Y'  ^sin,{lMM) Qin,{l,ki  ,k2) 


k=l 


(96) 


with  the  coefficients  affi^t,kl,k 2)  such  that 


E  <  +«, 


k= 1 


(97) 


and that $Y^{(l+1,m_1,m_2)} \subset Y^{(l,k_1,k_2)}$.

Then there exists a linear mapping


(98) 


converting  the  sequence  of  coefficients  {a??'{lM’k2)},  k  =  1,2,...  into  the  sequence  {a£’(,+1’mi’mj)}, 
m  =  1,2,...,  defined  by  the  formulae 


uu 

__  /^rO+^mi^ ),(*,*! ,fc2)  otit,(Z,fci,*2) 

m  “  ak  i 


Ar=l 


where 


suc/i  that  for  all  y  inside  yO+bm i»m2) 


(99) 

(100) 


oo 

gin,(l,kuk2) ^ 


m=l 


and 


m— 1 

Furthermore ,  for  any  p  >  1, 

p 

i 

771=1 


w  OO 

y  |a^,(/+1,mi,m2)|2  <  y  |ayi,(i,fcl'*2)|2 


Jfe=l 


||ffi7l,(^1,^2)(a:)  —  y  ^in, (i+l, mi , m2) (:r)5in,(i+1, mi, m2)flm,(/+l, mi ,7712)) | 


m,(/+l,mi,m2) 

-  5p-f  1 


N 


y  |a«*)(/,*i,*2)|2 

Jb=l 


(102) 

£,2(y(/+l,m1  ,m2))  < 

(103) 


3.4  Singular  Value  Decompositions  of  Translation  Operators 

The algorithm of the following section (like its counterpart for harmonic fields described, for example, in [3]) depends on the efficient application of the translation operators (70), (90), (98) to arbitrary vectors. Clearly, these operators convert functions on the square into functions on the square, and could be extremely expensive to deal with numerically. Fortunately, Lemmas 2.7, 2.8 of Section 2 guarantee that (asymptotically speaking) the cost of applying each of the operators (70), (90), (98) to an arbitrary vector is of the order

$$c + d \cdot \log(\varepsilon)^4, \tag{104}$$

with the constants $c$, $d$ independent of the operator to be applied (as long as the conditions of Lemma 2.8 are satisfied). We will discuss the procedure for the efficient numerical evaluation of the operator (90) in some detail; the operators (70), (98) are in this respect identical to the operator (90).

Let us consider the operator (90) with some $m_1$, $m_2$, $k_1$, $k_2$. Choosing some natural $n$, we construct an $n \times n$ tensor-product Gaussian discretization of each of the squares $Y^{(l,m_1,m_2)}$, $Y^{(l,k_1,k_2)}$, and expand the kernel $K$ on $Y^{(l,m_1,m_2)} \times Y^{(l,k_1,k_2)}$ into a 4-dimensional tensor product Legendre series. Due to Lemma 2.8, the error of such an expansion is bounded by

$$b(1 + n)^4 \cdot q^n, \tag{105}$$

where $b$ is a positive constant and $|q| < 1$. Choosing $n = c + d \cdot \log(\varepsilon)$, we guarantee that the error of our expansion is less than any a priori prescribed $\varepsilon$. An examination of (105) shows that the length of the expansion required to obtain reasonable accuracy is not excessive, though it is considerably greater than the lengths of expansions required for harmonic kernels (see, for example, [3]). An additional improvement in the required lengths of expansions is obtained by replacing the tensor-product Legendre expansions of the operators (70), (90), (98) with their Singular Value Decompositions via Theorems 2.9, 2.10, 2.11. The cost of this latter step (in terms of CPU time requirements) is of the order
p  ,  and  would  be  excessive,  except  for  the  fact  that  this  procedure  has  to  be  performed  only 
once  for  each  kernel,  since  the  necessary  SVDs  can  be  precomputed  and  stored;  needless  to 
say,  this  requires  an  amount  of  storage  proportional  to  p  •  r? . 
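To make the choice of $n$ in (105) concrete, the following sketch searches for the smallest order $n$ at which the bound $b(1 + n)^4 q^n$ drops below a prescribed $\varepsilon$. The values of $b$ and $q$ used here are placeholders chosen for illustration; the paper does not state them, and the helper names are hypothetical.

```python
def error_bound(n, b, q):
    """Value of the bound b * (1 + n)**4 * q**n from (105)."""
    return b * (1 + n) ** 4 * q ** n


def min_order(eps, b, q, n_max=10_000):
    """Smallest n >= 1 for which error_bound(n, b, q) < eps."""
    if not (0.0 < q < 1.0) or eps <= 0.0 or b <= 0.0:
        raise ValueError("expected 0 < q < 1, eps > 0 and b > 0")
    for n in range(1, n_max + 1):
        if error_bound(n, b, q) < eps:
            return n
    raise ValueError("no order up to n_max satisfies the bound")


if __name__ == "__main__":
    # Placeholder constants b = 1, q = 0.5; the required order grows
    # roughly linearly in log(1/eps), with a slowly varying correction
    # coming from the (1 + n)**4 factor.
    for eps in (1e-3, 1e-6, 1e-12):
        print(eps, min_order(eps, b=1.0, q=0.5))
```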

Remark 3.5. The situation is simplified when the kernel $K$ is convolutional, i.e., depends only on the difference between its arguments. Indeed, in this case, the SVDs of the translation operators $A^{(l-1,m_1,m_2),(l,k_1,k_2)}$, $B^{(l,m_1,m_2),(l,k_1,k_2)}$, $C^{(l+1,m_1,m_2),(l,k_1,k_2)}$ do not have to be calculated for all interacting pairs of squares on all levels, but only for all interactions of a single square on each level. In this case, the construction of the SVDs requires trivial amounts of both CPU time and disk space. When the kernel $K$ is not only convolutional but possesses additional symmetry (rotational, up-down, etc.) the situation is further simplified.

4  Generalized  Fast  Multipole  Method  in  Two  Dimensions 

4.1  Notation 

In  this  section  we  will  introduce  the  notation  to  be  used  in  the  description  of  the  algorithm. 

For any subset $A$ of the computational box, $T(A)$ will denote the set of particles inside $A$.

$B_l$ is the set of all nonempty boxes at the level $l$. $B_0$ will denote the computational box itself.

If a box contains more than $s$ particles, it is called a parent box. Otherwise, the box is said to be childless. Note that $s$ is the maximum number of points in a childless box.

A child box is a nonempty box obtained from the division of a parent box into four.

Colleagues are adjacent boxes of the same size at the same level. A given box has at most eight colleagues.

Two boxes $b$ and $c$ are said to be well separated if they are separated by a distance greater than or equal to the length of the side of the smaller box.

With each box $b$ at the level $l$, we will associate five lists of other boxes.

List 1 of a box $b$ will be denoted by $U_b$. It is empty if $b$ is a parent box. If $b$ is childless, it consists of $b$ and of all childless boxes $c$ that are adjacent to $b$.

List 2 of a box $b$ will be denoted by $V_b$. It consists of all boxes $c$ that are children of the colleagues of $b$'s parent and that are well separated from $b$.

List 3 of a box $b$ will be denoted by $W_b$. It is empty if $b$ is a parent box. If $b$ is childless, it consists of all descendants of $b$'s colleagues whose parents are adjacent to $b$ but who are not adjacent to $b$ themselves. Note that $b$ is separated from each box $c$ in $W_b$ by a distance greater than or equal to the length of the side of $c$.

List 4 of a box $b$ will be denoted by $X_b$. It consists of all boxes $c$ such that $b \in W_c$. Note that all boxes in List 4 are childless and larger than $b$.

List 5 of a box $b$ will be denoted by $Y_b$. It consists of all boxes $c$ that are well separated from $b$'s parent.
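As a simplified illustration of colleagues and List 2, the sketch below works on a uniform (non-adaptive) tree in which every box on every level is nonempty, and treats two same-size boxes as well separated when they are not adjacent. Boxes are identified by hypothetical tuples `(l, k1, k2)`; the adaptive Lists 1, 3, 4, and 5 are not covered.

```python
def colleagues(l, k1, k2):
    """Adjacent boxes of the same size on level l (at most eight)."""
    side = 2 ** l
    return [
        (l, i, j)
        for i in range(k1 - 1, k1 + 2)
        for j in range(k2 - 1, k2 + 2)
        if 1 <= i <= side and 1 <= j <= side and (i, j) != (k1, k2)
    ]


def parent(l, k1, k2):
    """Box on level l - 1 that contains the given box."""
    return (l - 1, (k1 + 1) // 2, (k2 + 1) // 2)


def children(l, k1, k2):
    """The four boxes on level l + 1 obtained by dividing the given box."""
    return [(l + 1, 2 * k1 - 1 + di, 2 * k2 - 1 + dj) for di in (0, 1) for dj in (0, 1)]


def list2(l, k1, k2):
    """Children of the colleagues of the parent that are not adjacent to the box."""
    if l == 0:
        return []
    near = set(colleagues(l, k1, k2)) | {(l, k1, k2)}
    result = []
    for c in colleagues(*parent(l, k1, k2)):
        result.extend(child for child in children(*c) if child not in near)
    return result


if __name__ == "__main__":
    # An interior box has 8 colleagues and at most 27 boxes in List 2.
    print(len(colleagues(3, 4, 4)), len(list2(3, 4, 4)))
```

The maximum of 27 boxes in List 2 is the factor that appears in the cost estimate of Step 6.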

$\Phi_b$ will denote the $p$-term outgoing singular function expansion for the box $b$.

$\Psi_b$ will denote the $p$-term incoming singular function expansion for the box $b$.

$\Gamma_b$ will denote the $p$-term incoming singular function expansion for the box $b$ due to all particles in $T(V_b)$.

$\Delta_b$ will denote the $p$-term incoming singular function expansion for the box $b$ due to all charges in $T(X_b)$.

$\Psi_b(r)$ is the result of evaluation of the expansion $\Psi_b$ at a particle $r \in T(b)$.

$\alpha_b(r)$ will denote the potential at $r \in T(b)$ due to all particles in $T(U_b)$.

$\beta_b(r)$ will denote the potential at $r \in T(b)$ due to all particles in $T(W_b)$.

$\gamma_b(r)$ will denote the potential at $r \in T(b)$ due to all particles in $T(Y_b)$.

$F(r)$ will denote the potential at $r$.

$A_{b,c}$ will denote the translation operator (a $p \times p$ matrix) in Theorem 3.2 for the boxes $b$ and $c$ such that $b = Y^{(l-1,m_1,m_2)}$ and $c = Y^{(l,k_1,k_2)}$.

$B_{b,c}$ will denote the translation operator (a $p \times p$ matrix) in Theorem 3.3 for the boxes $b$ and $c$ such that $b = Y^{(l,m_1,m_2)}$ and $c = Y^{(l,k_1,k_2)}$.

$C_{b,c}$ will denote the translation operator (a $p \times p$ matrix) in Theorem 3.4 for the boxes $b$ and $c$ such that $b = Y^{(l+1,m_1,m_2)}$ and $c = Y^{(l,k_1,k_2)}$.

4.2  Informal  Description  of  the  Algorithm 

1. Create the adaptive quad-tree. Compute the outgoing and incoming singular functions for each box in the computational tree, by means of Theorem 2.11.

2. For each childless box $b$, the interactions between particles in $T(b)$ and $T(U_b)$ are evaluated directly. For each particle $r \in T(b)$ the result is $\alpha_b(r)$.

3. For each childless box $b$, form an outgoing singular function expansion $\Phi_b$ by means of Theorem 3.1. For each parent box $b$, use Theorem 3.2 to translate and merge the outgoing singular function expansions of its children into the outgoing singular function expansion $\Phi_b$.

4. Use Theorem 3.3 to convert the outgoing singular function expansion of each box in $V_b$ into the incoming singular function expansion in the box $b$, adding the resulting expansions together to obtain $\Gamma_b$.

5. Convert the potential of all particles in $T(X_b)$ into an incoming singular function expansion in the box $b$, adding the resulting expansions to obtain $\Delta_b$. Add $\Delta_b$ to $\Gamma_b$.

6. For each childless box $b$, evaluate the potential $\beta_b(r)$ due to all particles in $T(W_b)$ by evaluating the outgoing singular function expansions $\Phi_c$ for each box $c \in W_b$.

7. Translate the incoming singular function expansion $\Gamma_B$ of $b$'s parent $B$ to the box $b$ by means of Theorem 3.4. Add the resulting local expansion to $\Gamma_b$.

8. For each childless box $b$, evaluate the local expansion $\Gamma_b$ at every particle $r \in b$ and add the result to $\alpha_b(r)$ and $\beta_b(r)$, obtaining the potential $F(r)$ at $r$.

4.3  Detailed  Description  of  the  Algorithm 

Step  1:  Initialization 

Comment [ Set the order $n$ of Legendre expansions, the number of terms $p$ in all singular function expansions, and the maximum number $s$ of the particles in a childless box. Create the computational tree. ]

do $l = 0, 1, 2, \ldots$
    do $b \in B_l$
        if $b$ contains more than $s$ particles then
            subdivide $b$ into four smaller boxes,
            ignore empty boxes, add nonempty boxes to $B_{l+1}$.
        endif
    enddo
enddo
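The subdivision loop above can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes particles are `(x, y)` tuples in the unit square, stores each level as a dictionary keyed by box coordinates `(k1, k2)`, and adds a hypothetical `max_level` guard so that more than $s$ coincident particles cannot cause endless subdivision.

```python
def build_tree(points, s, max_level=20):
    """Return a list of levels; levels[l] maps box coordinates (k1, k2) to the particles in that box.

    Only nonempty boxes are stored. A box holding more than s particles
    is a parent box and is subdivided into four on the next level.
    """
    levels = [{(1, 1): list(points)}]          # level 0: the computational box
    for level in range(max_level):
        side = 2 ** (level + 1)                # boxes per side on the next level
        next_boxes = {}
        for pts in levels[-1].values():
            if len(pts) <= s:
                continue                       # childless box: no subdivision
            for (x, y) in pts:
                k1 = min(int(x * side) + 1, side)
                k2 = min(int(y * side) + 1, side)
                next_boxes.setdefault((k1, k2), []).append((x, y))
        if not next_boxes:
            break
        levels.append(next_boxes)
    return levels


if __name__ == "__main__":
    particles = [(0.1, 0.1), (0.2, 0.2), (0.9, 0.9)]
    for l, boxes in enumerate(build_tree(particles, s=1)):
        print(l, sorted(boxes))
```

Because empty boxes are never created, the tree refines only where particles are dense, which is the adaptive behavior the comment describes.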


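Comment [ For each box $b$ in the computational tree, compute the outgoing and incoming singular value decompositions of the kernel $K$. ]

do $l = 0, 1, 2, \ldots$
    do $b \in B_l$
        Compute two singular value decompositions for $y \in b$ and $x$ in the interaction domain of $b$:

        $K(x, y) = \sum_{k=1}^{\infty} u_{b;k}^{\mathrm{out}}(x) \cdot s_{b;k}^{\mathrm{out}} \cdot v_{b;k}^{\mathrm{out}}(y)$,

        $K(y, x) = \sum_{k=1}^{\infty} u_{b;k}^{\mathrm{in}}(y) \cdot s_{b;k}^{\mathrm{in}} \cdot v_{b;k}^{\mathrm{in}}(x)$.
    enddo
enddo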


Step  2:  Local  Interactions 

Comment [ For each childless box $b$, evaluate interactions with the particles in $T(U_b)$ directly, obtaining the potential due to nearby particles. ]

do $l = 0, 1, 2, \ldots$
    do $b \in B_l$, $b$ is childless
        do $x_i \in T(b)$, $x_j \in T(U_b)$
            $\alpha_b(x_i) = \alpha_b(x_i) + K(x_i, x_j) \cdot q_j$
        enddo
    enddo
enddo

Cost [ $9(N/s) \cdot s \cdot s + 8(N/s) \cdot s \cdot s$ operations. ]


Step  3:  Outgoing  Singular  Function  Expansions 

Comment [ For each childless box $b$, form the outgoing singular function expansion $\Phi_b$. ]

do $l = 0, 1, 2, \ldots$
    do $b \in B_l$, $b$ is childless
        Evaluate the coefficients of the outgoing singular function expansion for the square $b$ by means of Theorem 3.1,

        $\Phi_{b;k} = \sum_{x_j \in b} q_j \cdot v_{b;k}^{\mathrm{out}}(x_j)$,

        for all $k = 1, \ldots, p$.
    enddo
enddo

Cost [ $Np$ operations. ]


Step  4:  Upward  Sweep 

Comment [ For each parent box $b$, form the outgoing singular function expansion by translating the outgoing singular function expansions of $b$'s children and adding the resulting expansions together. ]

do $l = \ldots, 2, 1, 0$
    do $b \in B_l$, $b$ is a parent box
        Use Theorem 3.2 to translate and merge the outgoing singular function expansions of $b$'s children $b_1$, $b_2$, $b_3$, $b_4$ into the outgoing singular function expansion

        $\Phi_b = \Phi_b + A_{b,b_1} \cdot \Phi_{b_1} + A_{b,b_2} \cdot \Phi_{b_2} + A_{b,b_3} \cdot \Phi_{b_3} + A_{b,b_4} \cdot \Phi_{b_4}$
    enddo
enddo

Cost [ $(4/3)(N/s) \cdot p^2$ operations. ]
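The merge performed in the upward sweep amounts to one $p \times p$ matrix-vector product per child. The sketch below shows that arithmetic with plain Python lists; the translation matrices passed in are arbitrary placeholders, since computing the actual operators of Theorem 3.2 is outside the scope of this example, and the function names are hypothetical.

```python
def matvec(matrix, vector):
    """Dense matrix-vector product; costs p * p multiplications for a p x p matrix."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def merge_children(translations, child_expansions, p):
    """Accumulate Phi_b as the sum over children c of A_{b,c} * Phi_c.

    translations[i] is the (placeholder) p x p matrix for the i-th child,
    child_expansions[i] is that child's p-term coefficient vector.
    """
    phi_b = [0.0] * p
    for a_bc, phi_c in zip(translations, child_expansions):
        shifted = matvec(a_bc, phi_c)
        phi_b = [u + v for u, v in zip(phi_b, shifted)]
    return phi_b


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    swap = [[0.0, 1.0], [1.0, 0.0]]
    print(merge_children([identity, swap], [[1.0, 2.0], [3.0, 4.0]], p=2))
```

Each child contributes $p^2$ multiplications, which is consistent with the $p^2$ factor in the cost estimate for this step.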


Step  5:  Adaptive  Part 

Comment [ For each childless box $b$, form the incoming singular function expansion $\Delta_b$ due to particles located in List 4 of $b$. ]

do $l = 0, 1, 2, \ldots$
    do $b \in B_l$, $b$ is childless
        Use Theorem 3.1 to evaluate the coefficients of the incoming singular function expansion $\Delta_b$ for the square $b$,

        $\Delta_{b;k} = \sum_{x_i \in T(X_b)} q_i \cdot v_{b;k}^{\mathrm{in}}(x_i)$,

        for all $k = 1, \ldots, p$.
    enddo
enddo

Cost [ $8(N/s) \cdot p \cdot s$ operations. ]

Comment [ For each box $b$, evaluate the outgoing singular function expansion at each particle located in boxes $c$ in List 4 of $b$. ]

do $l = 0, 1, 2, \ldots$
    do $b \in B_l$, $b$ is childless
        do $x_i \in X_b$
            $\beta_b(x_i) = \beta_b(x_i) + \sum_{k=1}^{p} \Phi_{b;k} \cdot s_{b;k}^{\mathrm{out}} \cdot u_{b;k}^{\mathrm{out}}(x_i)$.
        enddo
    enddo
enddo

Cost [ $8(N/s) \cdot p \cdot s$ operations. ]


Step  6:  Outgoing  to  Incoming 

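Comment [ For each box $b$, convert the outgoing singular function expansion $\Phi_c$ for each box $c$ in List 2 of $b$ into the incoming singular function expansion $\Gamma_b$, adding the resulting expansions together. ]

do $l = 0, 1, 2, \ldots$
    do $b \in B_l$
        For all boxes $c \in V_b$, convert the outgoing singular function expansion into the incoming singular function expansion for the box $b$ by means of Theorem 3.3. Add the resulting singular function expansions to $\Gamma_b$

        $\Gamma_b = \Gamma_b + \sum_{c \in V_b} B_{b,c} \cdot \Phi_c$.

        Add $\Gamma_b$ and $\Delta_b$ to obtain the incoming singular function expansion $\Psi_b$

        $\Psi_b = \Gamma_b + \Delta_b$.
    enddo
enddo

Cost [ $27 \cdot (4/3)(N/s) \cdot p^2$ operations. ]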


Step  7:  Downward  Sweep 

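Comment [ For every parent box $b$, translate the incoming singular function expansion $\Psi_b$ to $b$'s children incoming singular function expansions. ]

do $l = 0, 1, 2, \ldots$
    do $b \in B_l$, $b$ is a parent box
        do $c \in B_{l+1}$, $c$ is a $b$'s child
            Translate the incoming singular function expansion $\Psi_b$ by means of Theorem 3.4. Add the resulting local expansion to $\Psi_c$

            $\Psi_c = \Psi_c + C_{c,b} \cdot \Psi_b$.
        enddo
    enddo
enddo

Cost [ $(4/3)(N/s) \cdot p^2$ operations. ]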


Comment [ For each childless box $b$, evaluate incoming singular function expansions $\Psi_b$, obtaining the potential due to distant particles. Find the potential at $r \in b$ by adding $\alpha_b(r)$, $\beta_b(r)$, $\gamma_b(r)$ together. ]

do $l = 0, 1, 2, \ldots$
    do $b \in B_l$, $b$ is childless
        For each particle $x_j \in b$, evaluate

        $\gamma_b(x_j) = \sum_{k=1}^{p} \Psi_{b;k} \cdot s_{b;k}^{\mathrm{in}} \cdot u_{b;k}^{\mathrm{in}}(x_j)$.

        Add $\alpha_b(x_j)$, $\beta_b(x_j)$, $\gamma_b(x_j)$ to obtain the potential $F(x_j)$ at $x_j \in b$

        $F(x_j) = \alpha_b(x_j) + \beta_b(x_j) + \gamma_b(x_j)$.
    enddo
enddo

Cost [ $N \cdot p$ operations. ]


4.4  Complexity  of  the  Algorithm 

Since $s$ is the average number of particles in a childless box at the finest level, there are approximately $N/s$ childless boxes, and approximately

$$B = (1 + 1/4 + 1/4^2 + \ldots) \cdot (N/s) = \frac{4}{3} \cdot \frac{N}{s} \tag{106}$$

boxes in the tree hierarchy. Therefore, Step 3 requires $Np$ work, Step 4 requires $Bp^2$ work, Step 6 requires $27Bp^2$ work, Step 7 requires $Bp^2$ work, Step 8 requires $Np$ work, and Step 2 requires $9Ns$ work. The total operation count is

$$9Ns + 2Np + 29Bp^2 = 9Ns + 2Np + 29 \cdot \frac{4}{3} \cdot \frac{N}{s} \cdot p^2. \tag{107}$$

With $s = 2p$, the operation count becomes approximately

$$40Np. \tag{108}$$
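The arithmetic behind (107) and (108) can be checked directly. The short sketch below evaluates the operation count with exact rational arithmetic; the function name and the sample values of $N$ and $p$ are chosen only for illustration.

```python
from fractions import Fraction


def operation_count(N, s, p):
    """Evaluate 9*N*s + 2*N*p + 29*B*p**2 with B = (4/3) * (N/s), as in (106)-(107)."""
    B = Fraction(4, 3) * Fraction(N, s)
    return 9 * N * s + 2 * N * p + 29 * B * p * p


if __name__ == "__main__":
    N, p = 3000, 10
    total = operation_count(N, s=2 * p, p=p)
    # With s = 2p the count per (N * p) is 18 + 2 + 58/3 = 118/3, i.e. roughly 40.
    print(total / (N * p), float(total / (N * p)))
```

With $s = 2p$ the three terms contribute $18Np$, $2Np$, and $(58/3)Np$, for a total of about $39.3\,Np$, which the text rounds to $40Np$.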
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enddo 

enddo 

enddo 


&(*»)  =  0b(xi)  +  E*  6;*  *  S 


out 

b:k 


U, 


out 

b:k 


(**)• 


jfc=i 


Cost  [  8 (N/s)  ■  p  •  s  operations.  ] 


Step  6:  Outgoing  to  Incoming 

Comment [ For each box b, convert the outgoing singular function expansion $\Phi_c$ for each
box c in List 2 of b into the incoming singular function expansion $\Gamma_b$, adding the resulting
expansions together. ]

do l = 0, 1, 2, ...
do $b \in B_l$

For all boxes $c \in V_b$, convert the outgoing singular function expansion $\Phi_c$ into
the incoming singular function expansion for the box b by means of
Theorem 3.3. Add the resulting singular function expansions to $\Gamma_b$:

$$\Gamma_b = \Gamma_b + \sum_{c \in V_b} B_{b,c} \cdot \Phi_c.$$

Add $\Gamma_b$ and $\Delta_b$ to obtain the incoming singular function expansion

$$\Psi_b = \Gamma_b + \Delta_b.$$

enddo

enddo


Cost [ 27 · (4/3) (N/s) · p^2 operations. ]


Step  7:  Downward  Sweep 

Comment [ For every parent box b, translate the incoming singular function expansion $\Psi_b$
to b's children incoming singular function expansions. ]

do l = 0, 1, 2, ...

do $b \in B_l$, b is a parent box
do $c \in B_{l+1}$, c is b's child

Translate the incoming singular function expansion $\Psi_b$ by means of Theorem 3.4.
Add the resulting local expansion to $\Psi_c$:

$$\Psi_c = \Psi_c + C_{c,b} \cdot \Psi_b.$$

digits, we used 90-term singular function expansions, and obtained these (during the
precomputation stage) by starting with Legendre expansions of order 16.

2. For 3-digit accuracy, the break-even point of the scheme with the direct scheme is
n ≈ …; for 6 digits, the break-even point is n ≈ 800, and for 10 digits the scheme becomes
faster than the direct one at n ≈ 3000.

3. The efficiency of the algorithm does not suffer significantly when the charges in the
simulation are clustered. On the other hand, unlike its counterpart for harmonic kernels,
the algorithm does not seem to derive any advantage from the clustering of charges.

4. The cost of the algorithm grows rapidly with the increase of accuracy requirements.
The algorithm is considerably slower than modern versions of the FMM for harmonic fields,
especially in high-accuracy environments (see, for example, [10]).


5.1  Generalizations  and  Conclusions 


The algorithm of this paper has an obvious analogue in three dimensions: quad-trees are
replaced with oct-trees, two-dimensional expansions are replaced with three-dimensional
ones, and the programming becomes more involved. Such a scheme has been implemented
(see [6]), and found to work satisfactorily, as long as the required precision is low. For
accurate calculations, the CPU time requirements of the three-dimensional scheme become
prohibitive.

For many kernels the algorithm of this paper can be accelerated via an approach similar
to the one used in [4], [9], [10] to accelerate the FMM for harmonic fields in two and
three dimensions. Specifically, most of the operators (70), (90), (98) can be diagonalized;
this requires that the kernel K be approximated by linear combinations of exponentials on
appropriately chosen parts of the product Y(…) × Y(…). Needless to say, this can
not be done for a general kernel K; however, it appears to be possible for many kernels
(and classes of kernels) of interest. Such a scheme would require several developments (both
analytic and numerical); it would accelerate the two-dimensional version of the algorithm
significantly. The real pay-off of such a project would be in three dimensions, where it
would be likely to make large-scale high-precision simulations feasible.

The adaptive part of the algorithm in Step 5 requires $O(8(N/s)ps + 8(N/s)ps) = O(16Np)$
work, and Step 3 requires an additional $O(8(N/s)s^2) = O(8Ns)$ work. The total
operation count is

$$17Ns + 18Np + 29Bp^2 = 17Ns + 18Np + 29 \cdot \frac{4}{3} \cdot \frac{N}{s} \cdot p^2. \qquad (109)$$

By setting $s = 1.5p$, the operation count becomes approximately

$$69\,Np. \qquad (110)$$


5  Numerical  Results 


A FORTRAN program has been written implementing the algorithm described in the
preceding section. All timings listed below correspond to calculations performed on an
UltraSparc-I/16 computer with 128MB RAM, using double precision arithmetic. The order
of Legendre expansions was n = 4, n = 8, and n = 16, and the number of singular
functions varied from p = 9 to p = 36 to p = 90 in order to achieve roughly 3, 6 and 10
digits of accuracy, respectively.

The results of these experiments are presented in the tables below. The first column
contains the number of particles used in the simulation. The second column contains the time
for construction of the computational tree and precomputation of values of singular functions
at locations of particles. This can be done once for any given configuration of particles.
We do not include the time for precomputation of singular value decompositions in this
column, since this can be done in advance for any given kernel. The third column contains
the total run time of the algorithm. The fourth and the fifth columns contain the actual
time required by the algorithm and the time required by the direct algorithm, respectively.
Finally, the last two columns contain the relative 2-norm error $E_2$ and the relative maximum
error $E_\infty$ obtained at any one particle. They are defined by the formulae

$$E_2 = \left( \frac{\sum_{i=1}^{N} |\tilde f_i - f_i|^2}{\sum_{i=1}^{N} |f_i|^2} \right)^{1/2}, \qquad E_\infty = \max_{1 \le i \le N} \frac{|\tilde f_i - f_i|}{|f_i|}, \qquad (111)$$

where $f_i$ is the value of the potential at the i-th particle position obtained by the direct
calculation, and $\tilde f_i$ is the result obtained by the algorithm.
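
As an illustration of the two error measures in (111), the following sketch evaluates the potential by direct summation and compares it with a synthetically perturbed copy. It is not the authors' FORTRAN code: the function names, the positive random charges, and the form $f_i = \sum_{j \ne i} q_j / |x_i - x_j|^{power}$ of the potential are assumptions made for the example.

```python
import math
import random


def direct_potential(points, charges, power=1):
    """Direct O(N^2) sum f_i = sum_{j != i} q_j / |x_i - x_j|**power (assumed form)."""
    n = len(points)
    f = [0.0] * n
    for i in range(n):
        xi, yi = points[i]
        acc = 0.0
        for j in range(n):
            if j != i:
                xj, yj = points[j]
                acc += charges[j] / math.hypot(xi - xj, yi - yj) ** power
        f[i] = acc
    return f


def relative_errors(f_direct, f_approx):
    """Return (E_2, E_inf): relative 2-norm error and maximum relative error."""
    num = sum(abs(a - d) ** 2 for d, a in zip(f_direct, f_approx))
    den = sum(abs(d) ** 2 for d in f_direct)
    e2 = math.sqrt(num / den)
    einf = max(abs(a - d) / abs(d) for d, a in zip(f_direct, f_approx))
    return e2, einf


random.seed(0)
pts = [(random.random(), random.random()) for _ in range(200)]
q = [random.uniform(0.5, 1.5) for _ in range(200)]  # positive charges keep f_i away from 0
f = direct_potential(pts, q)
f_tilde = [v * (1.0 + 1.0e-3) for v in f]  # stand-in for the output of a fast algorithm
e2, einf = relative_errors(f, f_tilde)
```

Because every value is perturbed by the same relative amount, both measures return that amount; in a real comparison `f_tilde` would come from the fast algorithm, and $E_\infty$ is typically larger than $E_2$, as in the tables.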

In the first set of tests, the positions of particles were uniformly distributed in the unit
square. In the second set of tests, two fifths of the charged particles were distributed
uniformly along two ellipses and the remaining particles were distributed randomly in
three circles. The number of terms in the singular function expansions was set to 9, 36,
and 90, and the maximum number of particles in a childless box was set to 15, 61, and 153,
respectively.

Several  observations  can  be  made  from  Tables  1-12  below,  and  from  the  more  extensive 
numerical  experiments  performed  by  the  authors. 

1. The number of singular functions required to obtain 3-digit accuracy is 9; the
corresponding order of the Legendre expansions is 4. The 6-digit scheme requires 36-term
singular-function expansions, and Legendre expansions of order 8. In order to obtain 10

N      T_init(s)   T_alg(s)   T_run(s)   T_dir(s)   E_2           E_∞
200    0.007       0.009      0.015      0.019      0.11770E-03   0.85266E-03
400    0.015       0.018      0.034      0.076      0.27390E-03   0.19749E-02
800    0.024       0.047      0.071      0.310      0.29473E-03   0.20307E-02
1600   0.062       0.089      0.151      1.344      0.39506E-03   0.36146E-02
3200   0.105       0.213      0.318      5.371      0.42503E-03   0.38485E-02
6400   0.266       0.399      0.666      21.783     0.49194E-03   0.43736E-02

Table 1: Uniformly distributed particles. K(x,y) = 1/|x - y|, s = 15, p = 9, and n = 4.


N      T_init(s)   T_alg(s)   T_run(s)   T_dir(s)   E_2           E_∞
400    0.066       0.042      0.107      0.075      0.37968E-07   0.36455E-06
800    0.124       0.130      0.254      0.309      0.30664E-07   0.23301E-06
1600   0.255       0.251      0.505      1.347      0.59016E-07   0.63131E-06
3200   0.492       0.684      1.176      5.375      0.67426E-07   0.67145E-06
6400   0.997       1.230      2.227      21.756     0.16065E-06   0.16568E-05

Table 2: Uniformly distributed particles. K(x,y) = 1/|x - y|, s = 61, p = 36, and n = 8.


N      T_init(s)   T_alg(s)   T_run(s)   T_dir(s)   E_2           E_∞
800    0.832       0.213      1.045      0.316      0.35519E-11   0.27597E-10
1600   1.625       0.580      2.205      1.342      0.27911E-11   0.23206E-10
3200   3.210       1.374      4.515      5.371      0.47909E-11   0.35374E-10
6400   6.301       3.138      9.438      21.798     0.40687E-11   0.47116E-10

Table 3: Uniformly distributed particles. K(x,y) = 1/|x - y|, s = 153, p = 90, and n = 16.
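
The growth rates implied by the timings can be estimated directly from the table entries. The sketch below fits a straight line to log(T) against log(N) using the Table 1 values; the helper name `loglog_slope` and the choice of an unweighted least-squares fit are our own, not part of the original experiments.

```python
import math

# Timings transcribed from Table 1 (uniform distribution, 3-digit accuracy).
N = [200, 400, 800, 1600, 3200, 6400]
T_run = [0.015, 0.034, 0.071, 0.151, 0.318, 0.666]
T_dir = [0.019, 0.076, 0.310, 1.344, 5.371, 21.783]


def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    sxx = sum((a - mx) ** 2 for a in lx)
    return sxy / sxx


slope_run = loglog_slope(N, T_run)  # empirical exponent of the fast scheme
slope_dir = loglog_slope(N, T_dir)  # empirical exponent of the direct scheme

if __name__ == "__main__":
    print(f"fast scheme:   T ~ N^{slope_run:.2f}")
    print(f"direct scheme: T ~ N^{slope_dir:.2f}")
```

The fitted exponents come out close to 1 for the fast scheme and close to 2 for the direct scheme, which is the behavior the operation counts predict.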


N      T_init(s)   T_alg(s)   T_run(s)   T_dir(s)   E_2           E_∞
200    0.007       0.007      0.014      0.014      0.33680E-06   0.11237E-02
400    0.015       0.016      0.031      0.055      0.24487E-05   0.46567E-02
800    0.024       0.037      0.061      0.227      0.75789E-05   0.67792E-02
1600   0.063       0.077      0.140      1.016      0.36380E-04   0.82441E-02
3200   0.105       0.173      0.278      4.064      0.10114E-03   0.11347E-01
6400   0.267       0.353      0.619      16.397     0.42311E-04   0.12510E-01

Table 4: Uniformly distributed particles. K(x,y) = 1/|x - y|^2, s = 15, p = 9, and n = 4.


N      T_init(s)   T_alg(s)   T_run(s)   T_dir(s)   E_2           E_∞
400    0.065       0.033      0.099      0.055      0.33610E-09   0.96336E-06
800    0.126       0.098      0.223      0.225      0.74619E-09   0.55977E-06
1600   0.254       0.210      0.465      1.016      0.59034E-08   0.21584E-05
3200   0.493       0.529      1.022      4.090      0.18124E-07   0.17612E-05
6400   0.996       1.036      2.031      16.365     0.14692E-07   0.47616E-05

Table 5: Uniformly distributed particles. K(x,y) = 1/|x - y|^2, s = 61, p = 36, and n = 8.


N      T_init(s)   T_alg(s)   T_run(s)   T_dir(s)   E_2           E_∞
800    0.826       0.180      1.006      0.231      0.14987E-12   0.17505E-09
1600   1.597       0.450      2.047      1.009      0.32363E-12   0.74589E-10
3200   3.205       1.217      4.422      4.104      0.20036E-11   0.25330E-09
6400   6.315       2.507      8.823      16.404     0.46900E-12   0.16662E-09

Table 6: Uniformly distributed particles. K(x,y) = 1/|x - y|^2, s = 153, p = 90, and n = 16.


Figure 4: Uniformly distributed particles and the associated partition of the computational box.

Figure 5: Highly non-uniformly distributed particles and the associated partition of the
computational box.


N      T_init(s)   T_alg(s)   T_run(s)   T_dir(s)   E_2           E_∞
400    0.016       0.023      0.039      0.055      0.44531E-04   0.19765E-02
800    0.029       0.045      0.075      0.226      0.72969E-04   0.37896E-02
1600   0.058       0.100      0.158      1.016      0.98016E-04   0.70910E-02
3200   0.115       0.197      0.312      4.064      0.24054E-03   0.57700E-02
6400   0.225       0.382      0.608      16.405     0.23213E-03   0.82506E-02

Table 10: Highly non-uniformly distributed particles. K(x,y) = 1/|x - y|^2, s = 15, p = 9,
and n = 4.


N      T_init(s)   T_alg(s)   T_run(s)   T_dir(s)   E_2           E_∞
400    0.065       0.045      0.110      0.054      0.61825E-08   0.15019E-05
800    0.140       0.108      0.247      0.234      0.10608E-07   0.20936E-05
1600   0.265       0.312      0.577      1.016      0.13661E-07   0.18906E-05
3200   0.521       0.639      1.160      4.059      0.38933E-07   0.21694E-05
6400   1.043       1.439      2.481      16.408     0.38956E-07   0.61407E-05

Table 11: Highly non-uniformly distributed particles. K(x,y) = 1/|x - y|^2, s = 61, p = 36,
and n = 8.


N      T_init(s)   T_alg(s)   T_run(s)   T_dir(s)   E_2           E_∞
800    0.805       0.192      0.996      0.230      0.10539E-11   0.41111E-09
1600   1.717       0.477      2.194      1.010      0.68055E-12   0.18332E-09
3200   3.338       1.352      4.691      4.144      0.28719E-11   0.39139E-09
6400   6.540       4.045      10.586     16.411     0.29936E-11   0.21587E-09

Table 12: Highly non-uniformly distributed particles. K(x,y) = 1/|x - y|^2, s = 153, p = 90,
and n = 16.


N      T_init(s)   T_alg(s)   T_run(s)   T_dir(s)   E_2           E_∞
200    0.008       0.010      0.019      0.019      0.13524E-03   0.87697E-03
400    0.016       0.029      0.045      0.076      0.20754E-03   0.1146SE-02
800    0.029       0.058      0.087      0.309      0.26133E-03   0.12042E-02
1600   0.057       0.126      0.183      1.344      0.32551E-03   0.26410E-02
3200   0.114       0.245      0.358      5.368      0.37247E-03   0.34192E-02
6400   0.224       0.475      0.699      21.788     0.42360E-03   0.35911E-02

Table 7: Highly non-uniformly distributed particles. K(x,y) = 1/|x - y|, s = 15, p = 9,
and n = 4.


N      T_init(s)   T_alg(s)   T_run(s)   T_dir(s)   E_2           E_∞
400    0.065       0.060      0.125      0.076      0.59124E-07   0.61426E-06
800    0.140       0.139      0.279      0.315      0.77114E-07   0.11068E-05
1600   0.264       0.413      0.677      1.336      0.10049E-06   0.97051E-06
3200   0.528       0.834      1.363      5.439      0.12151E-06   0.12184E-05
6400   1.052       1.867      2.919      21.761     0.15353E-06   0.15668E-05

Table 8: Highly non-uniformly distributed particles. K(x,y) = 1/|x - y|, s = 61, p = 36,
and n = 8.


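N      T_init(s)   T_alg(s)   T_run(s)   T_dir(s)   E_2           E_∞
800    0.805       0.250      1.055      0.314      0.40445E-11   0.87339E-10
1600   1.716       0.603      2.319      1.338      0.61795E-11   0.75092E-10
3200   3.334       1.769      5.103      5.442      0.88132E-11   0.85507E-10
6400   6.540       5.366      11.906     21.810     0.11716E-10   0.12124E-09

Table 9: Highly non-uniformly distributed particles. K(x,y) = 1/|x - y|, s = 153, p = 90,
and n = 16.


$$K^{0,1}_{\gamma}(\sigma)(s) = \int_0^L \frac{\partial \Phi_{\gamma(t)}(\gamma(s))}{\partial N(s)}\, \sigma(t)\, dt, \qquad (137)$$

$$K^{1,1}_{\gamma}(\sigma)(s) = \mathrm{f.p.} \int_0^L \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s))}{\partial N(s)\, \partial N(t)}\, \sigma(t)\, dt, \qquad (138)$$

$$K^{2,1}_{\gamma}(\sigma)(s) = \mathrm{f.p.} \int_0^L \frac{\partial^3 \Phi_{\gamma(t)}(\gamma(s))}{\partial N(s)\, \partial N(t)^2}\, \sigma(t)\, dt, \qquad (139)$$

$$K^{0,2}_{\gamma}(\sigma)(s) = \mathrm{f.p.} \int_0^L \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s))}{\partial N(s)^2}\, \sigma(t)\, dt, \qquad (140)$$

$$K^{1,2}_{\gamma}(\sigma)(s) = \mathrm{f.p.} \int_0^L \frac{\partial^3 \Phi_{\gamma(t)}(\gamma(s))}{\partial N(s)^2\, \partial N(t)}\, \sigma(t)\, dt, \qquad (141)$$

$$K^{0,3}_{\gamma}(\sigma)(s) = \mathrm{f.p.} \int_0^L \frac{\partial^3 \Phi_{\gamma(t)}(\gamma(s))}{\partial N(s)^3}\, \sigma(t)\, dt, \qquad (142)$$

respectively.


Obviously, the operators $K^{0,1}_{\gamma}$, $K^{0,2}_{\gamma}$, $K^{0,3}_{\gamma}$, $K^{1,2}_{\gamma}$ given by the formulae
(137), (140), (142), (141) are the adjoints of the operators $K^{1,0}_{\gamma}$, $K^{2,0}_{\gamma}$, $K^{3,0}_{\gamma}$, $K^{2,1}_{\gamma}$
defined by (134) - (136), (139). Furthermore, $K^{1,1}_{\gamma}$, defined by (138), is self-adjoint.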


4  Proof  of  Results 

In this section, we prove the results from Section 2. The outline of this section is as follows.
First, we consider the case where $\gamma$ is a circle. We provide the proof for Theorem 2.6. In
Lemma 4.2, we evaluate the integral operators (134) - (140) for the case where $\gamma$ is a circle.
Then, by combining Theorem 2.6 and Lemma 4.2, we immediately obtain the jump conditions
for the operators (12) - (25) on a circle. These are stated in Theorem 4.3.


Proof of Theorem 2.6. Since the proofs of the identities are similar, we only prove (53)
for the case r = 1; the general case follows from a simple change of variables. We choose
the parametrization

$$\gamma(t) = (\cos(t), \sin(t)), \qquad (143)$$

where $t \in [-\pi, \pi]$. It immediately follows from (143) that

$$\int_{-\pi}^{\pi} \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(t)^2}\, e^{ikt}\, dt
= \int_{-\pi}^{\pi} \frac{1 - 2(1-h)\cos(t-s) + (1-h)^2 \cos(2(t-s))}{\left(1 + (1-h)^2 - 2(1-h)\cos(t-s)\right)^2}\, e^{ikt}\, dt$$

$$= e^{iks} \cdot \int_{-\pi}^{\pi} \frac{1 - 2(1-h)\cos(t) + (1-h)^2 \cos(2t)}{\left(1 + (1-h)^2 - 2(1-h)\cos(t)\right)^2}\, e^{ikt}\, dt. \qquad (144)$$

We now evaluate the integral on the right-hand side of (144). To this effect, the substitution

$$z = e^{it} \qquad (145)$$

converts (144) into

$$e^{iks} \cdot \int_{-\pi}^{\pi} \frac{1 - 2(1-h)\cos(t) + (1-h)^2 \cos(2t)}{\left(1 + (1-h)^2 - 2(1-h)\cos(t)\right)^2}\, e^{ikt}\, dt
= e^{iks} \cdot \oint_{|z|=1} \frac{1 - (1-h)(z + z^{-1}) + \frac{1}{2}(1-h)^2 (z^2 + z^{-2})}{\left(1 + (1-h)^2 - (1-h)(z + z^{-1})\right)^2} \cdot \left(-i z^{k-1}\right) dz, \qquad (146)$$

and after simple algebraic manipulation, we get

$$\frac{1 - (1-h)(z + z^{-1}) + \frac{1}{2}(1-h)^2 (z^2 + z^{-2})}{\left(1 + (1-h)^2 - (1-h)(z + z^{-1})\right)^2} \cdot \left(-i z^{k-1}\right)
= \frac{1}{2} \left( \frac{-i z^{k+1}}{((1-h) - z)^2} - \frac{i z^{k-1}}{(z(1-h) - 1)^2} \right). \qquad (147)$$

Substituting (147) into (146), we obtain

$$\int_{-\pi}^{\pi} \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(t)^2}\, e^{ikt}\, dt
= e^{iks} \cdot \oint_{|z|=1} \frac{1}{2} \left( \frac{-i z^{k+1}}{((1-h) - z)^2} - \frac{i z^{k-1}}{(z(1-h) - 1)^2} \right) dz. \qquad (148)$$

Now, formula (53) for r = 1 follows by applying a standard residue calculation to (148). □

Formulae (50) - (52), (57) - (58) follow from well-known results (see, for example,
[11, 3]). While the derivation of (53) - (56), (59) - (61) is similar, the authors failed to
find them in the literature.
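
The residue calculation applied to (148) can be cross-checked numerically. The sketch below integrates the right-hand side of (144) with s = 0 by the periodic trapezoidal rule and compares it with the value $\pi (|k|+1)(1-h)^{|k|}$ for $k \ne 0$ (and $2\pi$ for $k = 0$). That closed form is our own evaluation of the residues under the reading of (144) given above; it is not a quotation of formula (53), and the function names are hypothetical.

```python
import cmath
import math


def quadrupole_integral(k, h, m=4096):
    """Trapezoidal approximation of the last integral in (144) with s = 0."""
    a = 1.0 - h
    total = 0j
    for j in range(m):
        t = -math.pi + 2.0 * math.pi * j / m
        num = 1.0 - 2.0 * a * math.cos(t) + a * a * math.cos(2.0 * t)
        den = (1.0 + a * a - 2.0 * a * math.cos(t)) ** 2
        total += num / den * cmath.exp(1j * k * t)
    return total * (2.0 * math.pi / m)


def residue_value(k, h):
    """Value obtained from the double pole at z = 1 - h (our own evaluation)."""
    a = 1.0 - h
    if k == 0:
        return 2.0 * math.pi
    return math.pi * (abs(k) + 1) * a ** abs(k)


if __name__ == "__main__":
    for k in (0, 1, 3, -2):
        for h in (0.3, 0.05):
            print(k, h, abs(quadrupole_integral(k, h) - residue_value(k, h)))
```

The trapezoidal rule converges geometrically for this periodic analytic integrand, so a few thousand nodes are ample as long as h is not extremely small.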

The operators $K^{1,0}$, $K^{2,0}$, $K^{3,0}$, $K^{1,1}$, $K^{2,1}$, $K^{0,1}$, $K^{0,2}$, $K^{0,3}$, $K^{1,2}$ assume a particularly
simple form on the circle, as stated in the following lemma.

Lemma 4.2 Suppose that $\gamma$ is a circle of radius r parametrized by its arclength with external
unit normal denoted by N. Then for any sufficiently smooth function $\sigma : [-\pi r, \pi r] \to \mathbb{C}$,

(a)  K™(c){s)  =  r7_^df___5n 

7  7_rr  2r  (149) 

lb>  KA°M  =  f.p.  r  (A- _ — 2 - )  a(()di 


=  ttt  1  a0  - -ff(cr')(s) , 


2r2  '  2 r2  cos(^)  -  2 


W  *?>)«  =  f,p Ll._ 


r  2r3  cos(^)  —  2r3 

=  -2  *rr-250  —  3  !rr~l  . 

_  rr 


a(<)  df 


w  (i52) 

w  ,153, 

W  ^‘WW  -  ^£2 -^)M.  (.54, 


cr(i)  dt 


(9)  K^{a)(s)  =  fp  r  (-Lx  1  ^ 

p  J-TT  \2r2  +  2r2  cos(^)  -  2r2  J 

=  77  r_1  ^0  4-  7r  H(a')(s) .  (155) 

^  ^7  (a)(s)  _  f.p.  ^;3  cos(^j  _  2 r3  *  =  * r"1  H (</)(«) ,  (156) 

W  KL(a){s)  =  f.n  3  \ 

J--r  \  r3  2r3  cos(l^)  -  2r3J 

=  ~ 2<rr-2d0  —  3  7rr_1  H(a')(s) ,  (157) 

where H denotes the Hilbert transform (see (130) in Section 3.3).

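The following theorem is an immediate consequence of Theorem 2.6 and Lemma 4.2. It
summarizes the so-called jump conditions for the integrals (12) - (29) on the boundary $\Gamma$,
where $\Gamma$ is a circle.

Theorem 4.3 Suppose that $\gamma$ is a circle of radius r parametrized by its arclength with exterior
unit normal N, and that H denotes the Hilbert transform (130). Then,

(a) $$K^{1,0}_{\gamma,i}(\sigma)(s) = -\pi \sigma(s) + K^{1,0}_{\gamma}(\sigma)(s), \qquad (158)$$
    $$K^{1,0}_{\gamma,e}(\sigma)(s) = \pi \sigma(s) + K^{1,0}_{\gamma}(\sigma)(s), \qquad (159)$$

(b) $$K^{2,0}_{\gamma,i}(\sigma)(s) = \pi r^{-1} \sigma(s) + K^{2,0}_{\gamma}(\sigma)(s), \qquad (160)$$
    $$K^{2,0}_{\gamma,e}(\sigma)(s) = -\pi r^{-1} \sigma(s) + K^{2,0}_{\gamma}(\sigma)(s), \qquad (161)$$

(c) $$K^{3,0}_{\gamma,i}(\sigma)(s) = -2 \pi r^{-2} \sigma(s) + \pi \sigma''(s) + K^{3,0}_{\gamma}(\sigma)(s), \qquad (162)$$
    $$K^{3,0}_{\gamma,e}(\sigma)(s) = 2 \pi r^{-2} \sigma(s) - \pi \sigma''(s) + K^{3,0}_{\gamma}(\sigma)(s), \qquad (163)$$

(d) $$K^{0,1}_{\gamma,i}(\sigma)(s) = \pi \sigma(s) + (K^{1,0}_{\gamma})^*(\sigma)(s), \qquad (164)$$
    $$K^{0,1}_{\gamma,e}(\sigma)(s) = -\pi \sigma(s) + (K^{1,0}_{\gamma})^*(\sigma)(s), \qquad (165)$$

(e) $$K^{1,1}_{\gamma,i}(\sigma)(s) = K^{1,1}_{\gamma,e}(\sigma)(s) = K^{1,1}_{\gamma}(\sigma)(s) = -\pi H(\sigma')(s), \qquad (166)$$

(f) $$K^{2,1}_{\gamma,i}(\sigma)(s) = -\pi \sigma''(s) + K^{2,1}_{\gamma}(\sigma)(s), \qquad (167)$$
    $$K^{2,1}_{\gamma,e}(\sigma)(s) = \pi \sigma''(s) + K^{2,1}_{\gamma}(\sigma)(s), \qquad (168)$$

(g) $$K^{0,2}_{\gamma,i}(\sigma)(s) = -\pi r^{-1} \sigma(s) + K^{0,2}_{\gamma}(\sigma)(s), \qquad (169)$$
    $$K^{0,2}_{\gamma,e}(\sigma)(s) = \pi r^{-1} \sigma(s) + K^{0,2}_{\gamma}(\sigma)(s), \qquad (170)$$

(h) $$K^{1,2}_{\gamma,i}(\sigma)(s) = \pi \sigma''(s) + (K^{2,1}_{\gamma})^*(\sigma)(s), \qquad (171)$$
    $$K^{1,2}_{\gamma,e}(\sigma)(s) = -\pi \sigma''(s) + (K^{2,1}_{\gamma})^*(\sigma)(s), \qquad (172)$$

(i) $$K^{0,3}_{\gamma,i}(\sigma)(s) = 2 \pi r^{-2} \sigma(s) - \pi \sigma''(s) + (K^{3,0}_{\gamma})^*(\sigma)(s), \qquad (173)$$
    $$K^{0,3}_{\gamma,e}(\sigma)(s) = -2 \pi r^{-2} \sigma(s) + \pi \sigma''(s) + (K^{3,0}_{\gamma})^*(\sigma)(s). \qquad (174)$$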

We now proceed to the case where $\gamma$ is an arbitrary sufficiently smooth Jordan curve. The
following lemma can be found in most elementary textbooks on differential geometry.

Lemma 4.4 Suppose that $\gamma : [0, L] \to \mathbb{R}^2$ is a sufficiently smooth Jordan curve parametrized
by its arclength, with the unit normal and the unit tangent vectors at $\gamma(s)$ denoted by
N(s) and T(s), respectively. Then, there exist a positive real number a (dependent on $\gamma$) and
two continuously differentiable functions $f, g : (-a, a) \to \mathbb{R}$ (dependent on $\gamma$), such that for
any $s \in [0, L]$,

$$\gamma(s + t) - \gamma(s) = \left(t + t^3 \cdot f(t)\right) \cdot T(s) - \left(\frac{c\, t^2}{2} + t^3 \cdot g(t)\right) \cdot N(s), \qquad (175)$$

where c in (175) is the curvature of $\gamma$ at the point $\gamma(s)$. Furthermore,

$$|f(t)| \le \|\gamma'''(s)\|, \qquad (176)$$

$$|g(t)| \le \|\gamma'''(s)\|. \qquad (177)$$

In the local parametrization (175), the potential of a quadrupole located at $\gamma(s+t)$ and oriented
in the direction $N(s+t)$ assumes a particularly simple form, given by the following lemma.

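Lemma 4.5 Suppose that $\gamma : [0, L] \to \mathbb{R}^2$ is a sufficiently smooth Jordan curve parametrized
by its arclength. Then, there exist real positive numbers A, a and $h_0$ such that for any $s \in [0, L]$,

$$\left| \frac{\partial^2 \Phi_{\gamma(s+t)}(\gamma(s) - h \cdot N(s))}{\partial N(s+t)^2} - \frac{h^2 - t^2}{(h^2 + t^2)^2} - \frac{c\, h\, t^2\, (5h^2 + t^2)}{(h^2 + t^2)^3} \right| \le A, \qquad (178)$$

for all $|t| < a$, $0 < h < h_0$, where the coefficient c in (178) is the curvature of $\gamma$ at the
point $\gamma(s)$.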

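Proof. Without loss of generality, it is sufficient to prove the lemma for the case where
$\gamma(s) = (0, 0)$, $N(s) = (0, 1)$, $T(s) = (1, 0)$. Substituting (175) into (9) and evaluating the
result at $x = (0, -h)$, we obtain

$$\frac{\partial^2 \Phi_{\gamma(t)}(x)}{\partial N(t)^2} = \frac{p_0(h, t)}{\left(h^2 + t^2 + r(h, t)\right)^2}, \qquad (179)$$

where $p_0, r : \mathbb{R}^2 \to \mathbb{R}$ are functions given by the formulae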

h-t  +  cht+cA-^  +  3ht!  (/(„,.  s(())  _  2 13  (2  m  _  J(t)) 


(179) 


Po(h.t)  = 


(/(*)  +  M<>)  +  h  t3 (/'«)  -  p'(t>)  -  t<  (/'(f)  -  „'(,))  -  3 (/(,)’  +  <,(,)*) 

_T(/,(t)  ■i-»'(,))-t' /(!)(/'(«) -»'W)-«e «(«)(/'(()  +  ,'«)’ 


h-t-cht-r 


C  t 2  c2  f 3 


+ 


2  ■  3A(2(/«)-s(0)-2!s(2/(()+s(i)) 

-— (/«)  -  5j«))  +  *!3 (/'(*)  -  ff'(0)  +«' (/'(()  e- <,'(»))  +  3  (5(/«)J  +  p(()2) 
~T  (/,(‘)  -  «'('))  -  (6 /(«)(/'«)  +  s'(0)  -  teg(t)(/'tt)  -  ,'(())" 


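$$r(h,t) = -c\,h\,t^2 - 2\,h\,t^3 g(t) + \frac{c^2 t^4}{4} + 2\,t^4 f(t) + c\,t^5 g(t) + t^6 \left(f(t)^2 + g(t)^2\right). \qquad (181)$$

We also introduce the notation

$$p_1(h,t) = \left(h^2 + t^2 + r(h,t)\right)^2 - \left(h^2 + t^2\right)^2 = 2 \left(h^2 + t^2\right) r(h,t) + r(h,t)^2. \qquad (182)$$

Since $p_0$, $r$ are polynomial combinations of $f$, $g$, $f'$, $g'$, $t$ and $h$, an examination of
formulae (180) - (182) immediately shows that there exist positive real numbers a, $h_0$, and
C (dependent on $\gamma$) such that

$$\left| p_0(h,t) - (h^2 - t^2) - 3\,c\,h\,t^2 \right| \le C \left(h^2 + t^2\right)^2, \qquad (183)$$

$$\left| p_0(h,t) \cdot p_1(h,t) + 2\,c\,h\,t^2 \left(h^2 + t^2\right)\left(h^2 - t^2\right) \right| \le C \left(h^2 + t^2\right)^4, \qquad (184)$$

$$\left| p_0(h,t) \cdot p_1(h,t)^2 \right| \le C \left(h^2 + t^2\right)^6, \qquad (185)$$

$$\left| \frac{p_1(h,t)}{(h^2 + t^2)^2} \right| < 1, \qquad (186)$$

for all $h < h_0$, $t \in (-a, a)$. Substituting (182) into (179), we have

$$\frac{\partial^2 \Phi_{\gamma(t)}(x)}{\partial N(t)^2} = \frac{p_0(h,t)}{(h^2 + t^2)^2 \left(1 + \frac{p_1(h,t)}{(h^2 + t^2)^2}\right)} = \sum_{k=0}^{\infty} (-1)^k\, \frac{p_0(h,t) \cdot p_1(h,t)^k}{(h^2 + t^2)^{2k+2}}, \qquad (187)$$

where the convergence of the series follows from (186). Combining (183) - (185), we obtain

$$\left| \frac{\partial^2 \Phi_{\gamma(t)}(x)}{\partial N(t)^2} - \frac{h^2 - t^2}{(h^2 + t^2)^2} - \frac{c\,h\,t^2 (5h^2 + t^2)}{(h^2 + t^2)^3} \right|
\le \frac{\left| p_0(h,t) - (h^2 - t^2) - 3\,c\,h\,t^2 \right|}{(h^2 + t^2)^2}
+ \frac{\left| p_0(h,t) \cdot p_1(h,t) + 2\,c\,h\,t^2 (h^2 + t^2)(h^2 - t^2) \right|}{(h^2 + t^2)^4}
+ \sum_{k=2}^{\infty} \frac{\left| p_0(h,t) \cdot p_1(h,t)^k \right|}{(h^2 + t^2)^{2k+2}}
\le 2C + C \cdot \frac{1}{1 - \alpha}, \qquad (188)$$

with $\alpha$ defined by the formula

$$\alpha = \sup_{h < h_0,\ t \in (-a,a)} \left| \frac{p_1(h,t)}{(h^2 + t^2)^2} \right|. \qquad (189)$$

Now, introducing the notation

$$A = 2C + C \cdot \frac{1}{1 - \alpha}, \qquad (190)$$

we obtain (178). □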
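
Estimate (178) can be illustrated on a circle, where the quadrupole kernel is available in closed form. The sketch below assumes the fundamental solution $\Phi_{y}(x) = -\log\|x - y\|$ (the sign convention consistent with the jump relations for the circle), takes the evaluation point at $\gamma(s) - h \cdot N(s)$, and uses the curvature term $c\,h\,t^2 (5h^2 + t^2)/(h^2 + t^2)^3$; these are assumptions of the example, and all names are hypothetical.

```python
import math


def quadrupole_kernel_circle(h, t, r=1.0):
    """Second derivative of -log|x - y| along the normal at the source y.

    y lies on a circle of radius r at arclength offset t from gamma(s);
    x = gamma(s) - h*N(s) lies inside the circle (N is the outward normal).
    """
    rho = r - h
    s2 = 2.0 * math.sin(0.5 * t / r) ** 2   # 1 - cos(t/r), computed stably
    dist2 = h * h + 2.0 * r * rho * s2      # |x - y|^2
    dn = -h - rho * s2                      # (x - y) . N(y)
    return (2.0 * dn * dn - dist2) / dist2 ** 2


def flat_term(h, t):
    """Leading singular term: the kernel of a straight boundary."""
    return (h * h - t * t) / (h * h + t * t) ** 2


def curvature_term(h, t, c):
    """First curvature correction subtracted in the estimate."""
    return c * h * t * t * (5.0 * h * h + t * t) / (h * h + t * t) ** 3


grid = [1e-3, 3e-3, 1e-2, 3e-2, 5e-2]
residual = max(
    abs(quadrupole_kernel_circle(h, t) - flat_term(h, t) - curvature_term(h, t, 1.0))
    for h in grid for t in grid + [-v for v in grid]
)
flat_only = abs(quadrupole_kernel_circle(1e-3, 1e-3) - flat_term(1e-3, 1e-3))
```

On this grid the kernel itself reaches values of order $10^6$, while `residual` stays of order one; `flat_only` shows that omitting the curvature term leaves an error that grows like 1/h.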

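Lemma 4.2 provides an explicit formula for the operator $K^{2,0}$, defined in (135), in the
case when $\gamma$ is a circle. The following lemma shows that the operator $K^{2,0}$ on an arbitrary
sufficiently smooth Jordan curve of length L is a compact perturbation of $K^{2,0}$ on the circle
of radius $L/(2\pi)$. Its proof is an immediate consequence of estimate (178) in Lemma 4.5.

Lemma 4.6 Suppose that $\gamma : [0, L] \to \mathbb{R}^2$ is a sufficiently smooth Jordan curve parametrized
by its arclength, and that $\eta : [0, L] \to \mathbb{R}^2$ denotes the circle of radius $L/(2\pi)$ parametrized
by its arclength. Suppose further that $\sigma : [0, L] \to \mathbb{R}$ is a twice continuously differentiable
function. Then

$$\mathrm{f.p.} \int_0^L \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s))}{\partial N(t)^2}\, \sigma(t)\, dt = \mathrm{f.p.} \int_0^L \frac{\partial^2 \Phi_{\eta(t)}(\eta(s))}{\partial N(t)^2}\, \sigma(t)\, dt + M_2(\sigma)(s), \qquad (191)$$

where $M_2 : c[0, L] \to c[0, L]$ is a compact operator defined by the formula

$$M_2(\sigma)(s) = \int_0^L \left( \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s))}{\partial N(t)^2} - \frac{\partial^2 \Phi_{\eta(t)}(\eta(s))}{\partial N(t)^2} \right) \sigma(t)\, dt. \qquad (192)$$

Furthermore, for any $t \ne s$,

$$m_2(s,t) = \frac{2 \left(N(t) \cdot (\gamma(s) - \gamma(t))\right)^2}{\|\gamma(s) - \gamma(t)\|^4} - \frac{1}{2} \left(\frac{2\pi}{L}\right)^2
+ \frac{\|\gamma(s) - \gamma(t)\|^2 - 2 \left(\frac{L}{2\pi}\right)^2 \left(1 - \cos\left(\frac{2\pi}{L}(s - t)\right)\right)}{\|\gamma(s) - \gamma(t)\|^2 \cdot 2 \left(\frac{L}{2\pi}\right)^2 \left(1 - \cos\left(\frac{2\pi}{L}(s - t)\right)\right)}, \qquad (193)$$

and for $t = s$,

$$m_2(s,s) = \frac{5}{12} \left(c(s)\right)^2 - \frac{5}{12} \left(\frac{2\pi}{L}\right)^2, \qquad (194)$$

where c(s) is the curvature of $\gamma$ at the point $\gamma(s)$, and $m_2 : [0, L] \times [0, L] \to \mathbb{R}$ is the kernel
of $M_2$.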


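The following theorem provides the jump conditions for the operators (14) and
(15) on the boundary $\Gamma$, when $\Gamma$ is sufficiently smooth.

Theorem 4.7 Suppose that $\gamma : [0, L] \to \mathbb{R}^2$ is a sufficiently smooth Jordan curve parametrized
by its arclength. Then, for any sufficiently smooth function $\sigma : [0, L] \to \mathbb{R}$,

$$K^{2,0}_{\gamma,e}(\sigma)(s) - K^{2,0}_{\gamma,i}(\sigma)(s)
= \lim_{h \to 0} \int_0^L \left( \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) + h \cdot N(s))}{\partial N(t)^2} - \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(t)^2} \right) \sigma(t)\, dt
= -2 \pi\, c(s)\, \sigma(s), \qquad (195)$$

and

$$K^{2,0}_{\gamma,e}(\sigma)(s) + K^{2,0}_{\gamma,i}(\sigma)(s)
= \lim_{h \to 0} \int_0^L \left( \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) + h \cdot N(s))}{\partial N(t)^2} + \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(t)^2} \right) \sigma(t)\, dt
= 2 \cdot \mathrm{f.p.} \int_0^L \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s))}{\partial N(t)^2}\, \sigma(t)\, dt, \qquad (196)$$

where c(s) denotes the curvature of $\gamma$ at $\gamma(s)$. In other words, the quadrupole layer potential
with density $\sigma$ (see (6)) can be continuously extended from $\Omega$ to $\bar\Omega$ and from $\mathbb{R}^2 \setminus \bar\Omega$ to $\mathbb{R}^2 \setminus \Omega$,
with the limiting values given by the formulae

$$K^{2,0}_{\gamma,i}(\sigma)(s) = \pi\, c(s)\, \sigma(s) + \mathrm{f.p.} \int_0^L \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s))}{\partial N(t)^2}\, \sigma(t)\, dt, \qquad (197)$$

$$K^{2,0}_{\gamma,e}(\sigma)(s) = -\pi\, c(s)\, \sigma(s) + \mathrm{f.p.} \int_0^L \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s))}{\partial N(t)^2}\, \sigma(t)\, dt. \qquad (198)$$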
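Proof. We begin by proving (196). Suppose that $\eta : [0, L] \to \mathbb{R}^2$ denotes the circle of
radius $L/(2\pi)$ parametrized by its arclength. We define the functions
$\Sigma^h_{\gamma}, \Sigma^h_{\eta} : [0, L] \times [0, L] \to \mathbb{R}$ via the formulae

$$\Sigma^h_{\gamma}(s,t) = \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) + h \cdot N(s))}{\partial N(t)^2} + \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(t)^2}, \qquad (199)$$

$$\Sigma^h_{\eta}(s,t) = \frac{\partial^2 \Phi_{\eta(t)}(\eta(s) + h \cdot N(s))}{\partial N(t)^2} + \frac{\partial^2 \Phi_{\eta(t)}(\eta(s) - h \cdot N(s))}{\partial N(t)^2}, \qquad (200)$$

and, substituting (199), (200) into (196), obtain the identity

$$K^{2,0}_{\gamma,e}(\sigma)(s) + K^{2,0}_{\gamma,i}(\sigma)(s) = \lim_{h \to 0} \int_0^L \Sigma^h_{\eta}(s,t)\, \sigma(t)\, dt + \lim_{h \to 0} \int_0^L \left( \Sigma^h_{\gamma}(s,t) - \Sigma^h_{\eta}(s,t) \right) \sigma(t)\, dt. \qquad (201)$$

Substituting (160), (161) in Theorem 4.3 into (201), we have

$$K^{2,0}_{\gamma,e}(\sigma)(s) + K^{2,0}_{\gamma,i}(\sigma)(s) = 2 \cdot \mathrm{f.p.} \int_0^L \frac{\partial^2 \Phi_{\eta(t)}(\eta(s))}{\partial N(t)^2}\, \sigma(t)\, dt + \lim_{h \to 0} \int_0^L \left( \Sigma^h_{\gamma}(s,t) - \Sigma^h_{\eta}(s,t) \right) \sigma(t)\, dt. \qquad (202)$$

Due to Lemma 4.5, there exist positive real constants $C_0$, a, and $h_0$ such that for any $s \in [0, L]$,

$$\left| \Sigma^h_{\gamma}(s,t) - \Sigma^h_{\eta}(s,t) \right| \le C_0, \qquad (203)$$

for all $|t - s| < a$, $0 < h < h_0$. For any $t \ne s$ and sufficiently small h, both $\Sigma^h_{\gamma}(s,t)$ and $\Sigma^h_{\eta}(s,t)$
are continuous functions. Therefore, there also exist positive real constants $h_1$, $C_1$ such that for any
$s \in [0, L]$,

$$\left| \Sigma^h_{\gamma}(s,t) - \Sigma^h_{\eta}(s,t) \right| \le C_1, \qquad (204)$$

for all $|t - s| \ge a$, $0 < h < h_1$. Now, applying Lebesgue's dominated convergence theorem (see,
for example, [18]) to the second integral of the right hand side of (202), we obtain

$$\lim_{h \to 0} \int_0^L \left( \Sigma^h_{\gamma}(s,t) - \Sigma^h_{\eta}(s,t) \right) \sigma(t)\, dt = 2 \cdot \int_0^L \left( \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s))}{\partial N(t)^2} - \frac{\partial^2 \Phi_{\eta(t)}(\eta(s))}{\partial N(t)^2} \right) \sigma(t)\, dt. \qquad (205)$$

Now, (196) immediately follows from the combination of (202), (205) with (191),
(192) in Lemma 4.6. In order to prove formula (195), we define the functions
$\Delta^h_{\gamma}, \Delta^h_{\eta} : [0, L] \times [0, L] \to \mathbb{R}$ via the formulae

$$\Delta^h_{\gamma}(s,t) = \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) + h \cdot N(s))}{\partial N(t)^2} - \frac{\partial^2 \Phi_{\gamma(t)}(\gamma(s) - h \cdot N(s))}{\partial N(t)^2}, \qquad (206)$$

$$\Delta^h_{\eta}(s,t) = \frac{\partial^2 \Phi_{\eta(t)}(\eta(s) + h \cdot N(s))}{\partial N(t)^2} - \frac{\partial^2 \Phi_{\eta(t)}(\eta(s) - h \cdot N(s))}{\partial N(t)^2}, \qquad (207)$$

and, by substituting (206), (207) into (195), obtain the identity

$$K^{2,0}_{\gamma,e}(\sigma)(s) - K^{2,0}_{\gamma,i}(\sigma)(s) = \frac{c(s) L}{2\pi} \lim_{h \to 0} \int_0^L \Delta^h_{\eta}(s,t)\, \sigma(t)\, dt + \lim_{h \to 0} \int_0^L \left( \Delta^h_{\gamma}(s,t) - \frac{c(s) L}{2\pi} \cdot \Delta^h_{\eta}(s,t) \right) \sigma(t)\, dt. \qquad (208)$$

Substituting (160), (161) in Theorem 4.3 into (208), we get

$$K^{2,0}_{\gamma,e}(\sigma)(s) - K^{2,0}_{\gamma,i}(\sigma)(s) = -2 \pi\, c(s)\, \sigma(s) + \lim_{h \to 0} \int_0^L \left( \Delta^h_{\gamma}(s,t) - \frac{c(s) L}{2\pi} \cdot \Delta^h_{\eta}(s,t) \right) \sigma(t)\, dt. \qquad (209)$$

Due to Lemma 4.5, there exist positive real constants $C_0$, a, and $h_0$ such that for any $s \in [0, L]$,

$$\left| \Delta^h_{\gamma}(s,t) - \frac{c(s) L}{2\pi} \cdot \Delta^h_{\eta}(s,t) \right| \le C_0, \qquad (210)$$

for all $|t - s| < a$, $0 < h < h_0$. Similarly, there exist positive real constants $h_1$, $C_1$ such that

$$\left| \Delta^h_{\gamma}(s,t) - \frac{c(s) L}{2\pi} \cdot \Delta^h_{\eta}(s,t) \right| \le C_1, \qquad (211)$$

for all $|t - s| \ge a$, $0 < h < h_1$. Applying Lebesgue's dominated convergence theorem (see, for
example, [18]), we obtain

$$\lim_{h \to 0} \int_0^L \left( \Delta^h_{\gamma}(s,t) - \frac{c(s) L}{2\pi} \cdot \Delta^h_{\eta}(s,t) \right) \sigma(t)\, dt = \int_0^L \lim_{h \to 0} \left( \Delta^h_{\gamma}(s,t) - \frac{c(s) L}{2\pi} \cdot \Delta^h_{\eta}(s,t) \right) \sigma(t)\, dt. \qquad (212)$$

Examining (206), (207), we obviously have

$$\lim_{h \to 0} \left( \Delta^h_{\gamma}(s,t) - \frac{c(s) L}{2\pi} \cdot \Delta^h_{\eta}(s,t) \right) = 0. \qquad (213)$$

Thus, the integral on the right hand side of (212) is zero, from which (195) follows
immediately. □

5 Generalizations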

We have presented explicit (modulo an integral operator with a smooth kernel) formulae for
integro-pseudodifferential operators of potential theory in two dimensions (up to order 2). The
work presented here admits several obvious extensions.

a. Formulae (89) - (107) have their counterparts for elliptic PDEs other than the Laplace
equation. Indeed, for any elliptic PDE in two dimensions, the Green's function has the form

$$G(x, y) = \phi(x, y) \cdot \log(\|x - y\|) + \psi(x, y), \qquad (214)$$

with $\phi$, $\psi$ a pair of smooth functions; derivations of Section 4 are almost unchanged when
$\log(\|x - y\|)$ is replaced with (214). In particular, the counterparts of the formulae (89) - (99)
for the Helmholtz equation (with either real or complex Helmholtz coefficient) are identical to
(89) - (99); the counterparts of the formulae (100) - (107) for the Helmholtz equation do not
coincide with (100) - (107) exactly; instead, they assume the form

(a)  Ky  °(a)(s)  =  —2  it  (c(s)j  c{s)  +  k2  a(s)  -f  n a"(s)  —  2~  c'(s)  H(a) (s) 

—3  n c(s)  H(a')(s)  +  A3 (<r)(s) ,  (215) 

K*?e{<r)(s)  =  2?r  (c(s))  CT(-s) -4:rfc2a(s)  -  7ro-"(s)  -  27rc'(s)if(a)(s) 

—Z~c(s)H(a')(s)  +  N3(a)(s) ,  (216) 

Un  K-r,i(a)(s)  =  —4  irk2  cr(s)  —  t:  a"{s)  +  7T c'(s)  H(u)(s)  +  it  c(s) 

+G3((7)(S),  (217j 

KyMa)(s)  =  4vk2a(s)  +  TTa"(s)  +  irc'{s)H{a)(s)-~T:c(s)H(c,){s) 

-rG3(cr)(s),  (218) 

(c)  Ky.  (a)(s)  =  4tt  k2  cr(s)  4-  n  a"(s)  +  tt  c(s)  H(a')(s)  -f  G3(a)(s) ,  (219) 

i^;2(a)(S)  =  -^^<r(3)-ira,,(s)  +  irc{B)H(a,)(s)  +  G3((T)(8),  (220) 

(d)  Kyi{a)(s)  =  2tt(c(s))  <?(s)  -  4  k  k2  c{s)  -  t  o"{s)  ~  tt  c'{s)  H(er)(s) 

— Znc(s)  H{cr’)($)  +  N3(a)(s) ,  (221) 

^y,e(cr)(s)  —  ~^7r  (c(s))  &(s)  +  4  k  k2  cr(s)  ■+ 7r  a"(s)  —  ir  c'(s)  If(<7)(s) 

-3  7rc(s)  E(a')(s)  +  N3(cr)(s) ,  (2 22 ) 

where k is the Helmholtz coefficient, and the operators $N_3$, $G_3$, $\tilde N_3$, $\tilde G_3 : L^2[0, L] \to L^2[0, L]$
are compact.

b. The derivation of the three-dimensional counterparts of formulae (89) - (107) is completely
… The expressions have been obtained, the paper reporting …

… Problems of this type are currently under investigation.

We introduce a new version of the combined field integral equation (CFIE) for the solution
of  electromagnetic  scattering  problems  in  three  dimensions.  Unlike  the  conventional  CFIE, 
the  version  reported  here  is  well-conditioned.  While  we  use  a  standard  magnetic  field 
integral  operator,  we  precondition  the  electric  field  integral  operator,  converting  it  into  a 
second-kind  integral  operator;  the  resulting  CFIE  is  an  integral  equation  of  the  second  kind 
that  has  no  spurious  resonances.  We  also  report  numerical  results  showing  that  the  new 
formulation  stabilizes  the  number  of  iterations  needed  to  solve  the  CFIE  on  closed  surfaces. 
This  is  in  contrast  to  the  conventional  CFIE,  where  the  number  of  iterations  grows  as  the 
discretization is refined.

1 Introduction

Recent  progress  in  the  construction  of  “fast”  methods  for  the  solution  of  the 
boundary  integral  equations  of  scattering  theory  [1]  has  vastly  increased  the  size 
of  tractable  problems  [2,  3];  it  has  also  increased  the  need  for  well-conditioned 
boundary  integral  formulations.  There  are  two  principal  reasons  for  this: 

• Since we have sparse decompositions of the integral operators of scattering
theory, but not their inverses, we employ iterative solvers. Well-conditioned
systems of equations can be solved with few iterations.

• Using a fine discretization to resolve source variations or geometric detail
on a subwavelength scale results in an ill-conditioned linear equation.
This is sometimes called the “low frequency” problem in computational
electromagnetics.


Onlj-  second-kind  integral  equations  (see  Appendix),  or  objects  with  similar 
spectral  behavior  (such  as  appropriately  preconditioned  differential  equations) 
can  be  solved  with  fully  controlled  approximation  error.  The  correct  operators 
are  the  sum  of  a  constant  (or  at  least  well-conditioned  and  easily  invertible) 
operator  and  a  compact  operator. 
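
The effect of these spectral properties on conditioning can be illustrated with a toy model. The sketch below is not taken from the letter: the three eigenvalue sequences (1/j, j, and 1/2 + 1/j^2) are assumptions chosen only to mimic a first-kind operator, a hypersingular operator, and a second-kind operator (constant plus compact), and the operators are treated as diagonal so that the condition number is simply the ratio of extreme eigenvalue moduli.

```python
# Toy spectra (illustrative assumptions, not operators from the text):
#   first kind:    eigenvalues 1/j          -> accumulate at zero
#   hypersingular: eigenvalues j            -> unbounded
#   second kind:   eigenvalues 1/2 + 1/j^2  -> constant plus compact part


def condition_number(eigs):
    """Ratio of largest to smallest modulus; for a diagonal (normal)
    operator this equals the 2-norm condition number."""
    mags = [abs(e) for e in eigs]
    return max(mags) / min(mags)


def spectra(n):
    """Return the first n eigenvalues of each toy operator."""
    js = range(1, n + 1)
    return {
        "first kind": [1.0 / j for j in js],
        "hypersingular": [float(j) for j in js],
        "second kind": [0.5 + 1.0 / j**2 for j in js],
    }


if __name__ == "__main__":
    # Refining the discretization corresponds to increasing n.
    for n in (10, 100, 1000):
        row = {name: round(condition_number(e), 3)
               for name, e in spectra(n).items()}
        print(n, row)
```

In this model the first-kind and hypersingular condition numbers grow linearly with n, while the second-kind condition number stays below 3 however fine the discretization becomes.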

Boundary  integral  operators  of  scattering  typically  violate  this  requirement 
in  one  of  three  ways: 

• The spectrum may accumulate at zero. A typical example is the first-kind
integral equation for the scalar Dirichlet problem (used for 2d electromagnetic
scattering calculations in TM polarization).

• The operator may have an unbounded spectrum, as does a pseudodifferential
or hypersingular operator.

• The operator may have small eigenvalues associated with resonances, often
unphysical; the latter are often referred to as “spurious resonances” (see,
for example, [4]).

For electromagnetic scattering from perfectly electrically conducting (PEC)
surfaces, the standard boundary integral equations are the electric field integral
equation (EFIE)

$$-\mathbf{n} \times \mathbf{E}^i = T\mathbf{J} \tag{1}$$

and the magnetic field integral equation (MFIE)

$$Z\,\mathbf{n} \times \mathbf{H}^i = \left(K + \tfrac{1}{2}\right)\mathbf{J}, \tag{2}$$
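where the integral operators T and K are defined (as in [5]) by*

$$T\mathbf{J} = T(k)\mathbf{J} = ik\,\mathbf{n}(\mathbf{x}) \times \int_S ds' \left\{ G(k,\mathbf{x},\mathbf{x}')\,\mathbf{J}(\mathbf{x}') + \frac{1}{k^2}\,\nabla\left[\nabla G(k,\mathbf{x},\mathbf{x}') \cdot \mathbf{J}(\mathbf{x}')\right] \right\} \tag{3}$$

$$K\mathbf{J} = K(k)\mathbf{J} = -\mathbf{n}(\mathbf{x}) \times \int_S ds'\, \nabla G(k,\mathbf{x},\mathbf{x}') \times \mathbf{J}(\mathbf{x}'), \tag{4}$$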

where $\nabla$ denotes differentiation with respect to $\mathbf{x}$, and $\mathbf{n}(\mathbf{x})$ is the unit normal
to the surface at $\mathbf{x}$.

The MFIE is a second-kind integral equation. Unfortunately, this equation
is suitable for an unacceptably small class of electromagnetic problems. It is
inapplicable to open surfaces, becomes ill-conditioned in the presence of geometric
singularities, and suffers from spurious resonances. The EFIE has both
a compact piece and a hypersingular piece (coming from the double gradient
term). One can eliminate the spurious resonances of the MFIE by adding the
EFIE to form a combined field integral equation (CFIE) [6]. The cost of doing
so is the introduction of the EFIE’s hypersingular piece, which spoils the
conditioning for fine discretizations (or low frequencies).

Adams  and  Brown  [7,  8]  and  Kolm  and  Rokhlin  [9]  recently  observed  that 
a hypersingular integral operator and a first-kind integral operator are ideal
preconditioners  for  each  other,  in  the  sense  that  the  composition  of  the  two  has 
the  spectral  characteristics  of  a  second-kind  integral  operator.  In  this  letter, 
we  show  how  the  same  approach  can  be  employed  to  analytically  precondition 
the  EFIE.  In  fact  (as  was  implicit  in  a  result  of  Hsiao  and  Kleinman  [5]),  the 
electric  field  integral  operator  T  preconditions  itself. 

Two  issues  raised  in  [8]  are  important  for  the  successful  application  of  this 
idea  to  closed  bodies.  First,  only  the  local  (or  short  distance)  behavior  of 
the  preconditioner  is  important  for  asymptotic  conditioning.  Thus,  one  can 
precondition  the  EFIE  by  multiplying  it  by  an  electric  field  integral  operator 
corresponding  to  an  arbitrary  wavenumber,  real  or  complex;  if  the  wavenumber 
has  a  positive  imaginary  part,  one  avoids  the  introduction  of  any  additional 
resonances.  (Obviously,  if  the  EFIE  preconditioner  reproduced  the  MFIE  reso¬ 
nances,  then  the  CFIE  would  also  have  them.)  Second,  one  must  take  care  that 
the  discretization  of  the  product  of  preconditioner  and  preconditioned  operators 
preserves  the  correct  spectral  properties. 

In  this  letter  we  describe  well-conditioned  formulations  for  both  open  and 
closed  surfaces.  We  also  present  numerical  results  for  closed  surfaces  which 
demonstrate  the  advantages  of  the  new  CFIE  formulation  over  the  conventional 
CFIE. 

* Other terms follow the usual conventions: $\mathbf{J} = Z\,\mathbf{n} \times \mathbf{H}$ is the unknown surface current,
$\mathbf{E}^i$ and $\mathbf{H}^i$ are the incident electric and magnetic fields, respectively, $Z = \sqrt{\mu/\epsilon}$ is the wave
impedance, and $G(k,\mathbf{x},\mathbf{x}') = \exp(ikr)/4\pi r$ is the 3d Helmholtz kernel with $r = |\mathbf{x}-\mathbf{x}'|$
being the distance separating field and source points. Harmonic time dependence $e^{-i\omega t}$ is
assumed.




2  Preconditioning  the  EFIE  operator 

References  [8]  and  [9]  consider  integral  operators  constructed  from  the  kernel 
for  the  Laplace  and  Helmholtz  equations  in  2d.  They  observe  that  the  prod¬ 
uct  of  a  first-kind  operator,  constructed  from  an  undifferentiated  kernel,  and  a 
hypersingular  operator,  constructed  from  a  twice  differentiated  kernel,  has  the 
desirable spectral characteristics of a second-kind operator. Since the EFIE
integral operator T has both of these, one might expect that the composition of
two such operators, $T^2 = T \circ T$, would include a constant operator and a compact
operator. One might also worry that the product of hypersingular components
would produce another hypersingular operator. It is easy to see, however, that
the rotation operation $\mathbf{n} \times$ in the definition (3) of T, which annihilates the
component of the surface vector field normal to the surface, also ensures that the
product of the two hypersingular operators is identically equal to zero. Indeed,
applying the hypersingular component of the second T operator to an arbitrary
tangential surface vector function $\mathbf{f}(\mathbf{x}')$ produces a surface gradient function

$$\mathbf{n} \times \nabla\Phi(\mathbf{x}) = \frac{i}{k}\,(\mathbf{n} \times \nabla) \int_S ds'\, \nabla G(k,\mathbf{x},\mathbf{x}') \cdot \mathbf{f}(\mathbf{x}'), \tag{5}$$

which the hypersingular component of the first T operator, in turn, annihilates
(for closed surfaces) by virtue of the identity

$$\nabla_S \cdot \left[\mathbf{n} \times \nabla\Phi(\mathbf{x})\right] = 0, \tag{6}$$

with $\nabla_S$ denoting the surface gradient operator on S; identity (6), the surface
analog of the 3d identity $\nabla \cdot [\nabla \times \boldsymbol{\varphi}(\mathbf{x})] = 0$, can be found, for example, in [10],
and is valid for any sufficiently smooth function $\Phi$ on S. It follows immediately
from (3), (5), and (6) that $T^2$ behaves as a second-kind integral operator.

In  this  letter  we  investigate  in  detail  the  spectral  properties  of  the  EFIE 
and  MFIE  integral  operators  and  combinations  thereof  for  the  PEC  sphere,  a 
simple  3d  target  for  which  the  spectral  properties  of  these  operators  are  known 
analytically.  A  complete  set  of  basis  functions  on  the  surface  of  a  sphere  of 
radius  a  is  given  by  the  vector  spherical  harmonics  [11] 


x,„(».t’)=— (7) 

$$\mathbf{U}_{lm}(\theta,\varphi) = \mathbf{n} \times \mathbf{X}_{lm}(\theta,\varphi), \tag{8}$$

defined here in terms of the scalar spherical harmonics $Y_{lm}(\theta,\varphi)$.

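The result of applying T and $K^+ = (K + \tfrac{1}{2})$ to each basis function is† [5]

$$T(k) \begin{Bmatrix} \mathbf{X}_{lm} \\ \mathbf{U}_{lm} \end{Bmatrix} = \begin{Bmatrix} -\mathbb{J}_l(ka)\,\mathbb{H}_l(ka)\,\mathbf{U}_{lm} \\ \mathbb{J}_l'(ka)\,\mathbb{H}_l'(ka)\,\mathbf{X}_{lm} \end{Bmatrix} \tag{9}$$

and

$$K^+(k) \begin{Bmatrix} \mathbf{X}_{lm} \\ \mathbf{U}_{lm} \end{Bmatrix} = \begin{Bmatrix} i\,\mathbb{J}_l'(ka)\,\mathbb{H}_l(ka)\,\mathbf{X}_{lm} \\ -i\,\mathbb{J}_l(ka)\,\mathbb{H}_l'(ka)\,\mathbf{U}_{lm} \end{Bmatrix} \tag{10}$$

where $\mathbb{J}_l$ and $\mathbb{H}_l$ are Riccati-Bessel and (first-kind) Riccati-Hankel functions of
order l, and k is the wavenumber associated with the kernel of each integral
operator. The Riccati-Bessel and Riccati-Hankel functions are defined [12] in
terms of spherical Bessel and Hankel functions $j_l(x)$ and $h_l^{(1)}(x)$ by

$$\mathbb{J}_l(x) = x\,j_l(x), \tag{11}$$

$$\mathbb{H}_l(x) = x\,h_l^{(1)}(x). \tag{12}$$

† The MFIE eigenvalues in [5] contain a sign error which is corrected here.
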

Although our chosen basis functions $\mathbf{X}_{lm}$ and $\mathbf{U}_{lm}$ are not eigenfunctions of
the operator $T(k)$, they are eigenfunctions of $T^2(k) = T(k) \circ T(k)$:

$$T^2(k) \begin{Bmatrix} \mathbf{X}_{lm} \\ \mathbf{U}_{lm} \end{Bmatrix} = -\mathbb{J}_l(ka)\,\mathbb{H}_l(ka)\,\mathbb{J}_l'(ka)\,\mathbb{H}_l'(ka) \begin{Bmatrix} \mathbf{X}_{lm} \\ \mathbf{U}_{lm} \end{Bmatrix}. \tag{13}$$

The operator $T^2(k)$ has a bounded spectrum, since, in the limit of large l,
its eigenvalues accumulate at $-\tfrac{1}{4}$ (a result which follows from the asymptotic
properties of $j_l$ and $h_l^{(1)}$ given, for example, in [12]). However, as is evident
from (10) and (13), the operator $T^2(k)$ also shares resonances (at the zeros of
$\mathbb{J}_l'(ka)$ for the $\mathbf{X}_{lm}$ modes, and at the zeros of $\mathbb{J}_l(ka)$ for the $\mathbf{U}_{lm}$ modes) with
the MFIE operator $K^+(k)$. This fact is also evident from the identity

$$T^2(k) = K^2(k) - \tfrac{1}{4} = K^-(k) \circ K^+(k), \tag{14}$$

(where $K^- = K - \tfrac{1}{2}$) derived in [5]. Therefore, although $T^2(k)$ is a second-kind
integral operator, it is not a suitable component of a resonance-free combined
field integral equation for closed bodies.
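
The accumulation point of the $T^2(k)$ eigenvalues can be checked with the leading large-order forms of the Riccati functions. The sketch below is not part of the original derivation; it writes $\mathbb{J}_l$ and $\mathbb{H}_l$ for the Riccati-Bessel and Riccati-Hankel functions of (11) and (12), and assumes the standard small-argument asymptotics of $j_l$ and $h_l^{(1)}$, valid for $l \gg |x|$.

```latex
\begin{align*}
\mathbb{J}_l(x) &\approx \frac{x^{\,l+1}}{(2l+1)!!}, &
\mathbb{H}_l(x) &\approx -i\,\frac{(2l-1)!!}{x^{\,l}}, \\[4pt]
\mathbb{J}_l(x)\,\mathbb{H}_l(x) &\approx \frac{-i\,x}{2l+1}, &
\mathbb{J}_l'(x)\,\mathbb{H}_l'(x) &\approx \frac{i\,l(l+1)}{(2l+1)\,x}, \\[4pt]
-\,\mathbb{J}_l(x)\,\mathbb{H}_l(x)\,\mathbb{J}_l'(x)\,\mathbb{H}_l'(x)
 &\approx -\frac{l(l+1)}{(2l+1)^2}
 \;\longrightarrow\; -\tfrac{1}{4} \quad (l \to \infty).
\end{align*}
```

The limit agrees with identity (14): K is compact, so the eigenvalues of $K^2$ tend to zero and those of $K^2 - \tfrac{1}{4}$ tend to $-\tfrac{1}{4}$.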

As stated earlier, boundedness of the spectrum of the product of two EFIE
operators (of the form (3)) is assured if they have the same short-distance
behavior, a condition that does not require the two operators to share the same
wavenumber (propagation constant). If we choose EFIE operators with different
wavenumbers, $T(k_1)$ and $T(k_2)$, we can simultaneously obtain a bounded
product and avoid MFIE resonances.

The following analysis indicates that ik is a particularly good choice for the
wavenumber in the preconditioning operator (assuming that the wavenumber k
is real). The eigensystem for $T(ik) \circ T(k)$ on a sphere is

$$T(ik) \circ T(k) \begin{Bmatrix} \mathbf{X}_{lm} \\ \mathbf{U}_{lm} \end{Bmatrix} = - \begin{Bmatrix} \mathbb{J}_l'(ika)\,\mathbb{H}_l'(ika)\,\mathbb{J}_l(ka)\,\mathbb{H}_l(ka)\,\mathbf{X}_{lm} \\ \mathbb{J}_l(ika)\,\mathbb{H}_l(ika)\,\mathbb{J}_l'(ka)\,\mathbb{H}_l'(ka)\,\mathbf{U}_{lm} \end{Bmatrix}. \tag{15}$$

It is straightforward to show (given the properties [12] of $j_l$ and $h_l^{(1)}$) that the
eigenvalues of $T(ik) \circ T(k)$ accumulate at $\tfrac{i}{4}$ and $-\tfrac{i}{4}$ for the $\mathbf{X}_{lm}$ and $\mathbf{U}_{lm}$
eigenmodes, respectively, and that $T(ik) \circ T(k)$ does not share any resonances
with the MFIE operator $K^+(k)$.
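
These spectral statements can be checked numerically on the sphere. The following is a minimal sketch, not the authors' code: it assumes the eigenvalue expressions (10), (13), and (15) in the form $i\mathbb{J}_l'\mathbb{H}_l$, $-i\mathbb{J}_l\mathbb{H}_l'$ for $K^+$, $-\mathbb{J}_l\mathbb{H}_l\mathbb{J}_l'\mathbb{H}_l'$ for $T^2$, and the corresponding mixed products for $T(ik) \circ T(k)$. The helper names (`spherical_j`, `spherical_h1`, `riccati`, `eig_T2`, `eig_TikT`, `eig_Kplus`) are hypothetical, and the special functions are evaluated from textbook series so that only the Python standard library is needed.

```python
import cmath
import math
from math import factorial


def spherical_j(l, x, terms=80):
    """Spherical Bessel j_l(x) from its ascending power series (complex x allowed)."""
    x = complex(x)
    double_fact = 1.0
    for q in range(1, 2 * l + 2, 2):      # (2l+1)!!
        double_fact *= q
    term = x**l / double_fact
    total = term
    for s in range(terms):
        term *= (-x * x / 2) / ((s + 1) * (2 * l + 2 * s + 3))
        total += term
    return total


def spherical_h1(l, x):
    """Spherical Hankel h_l^(1)(x) from its exact finite sum."""
    x = complex(x)
    total = 0j
    for m in range(l + 1):
        coeff = factorial(l + m) / (factorial(m) * factorial(l - m))
        total += (1j**m) * coeff / (2 * x) ** m
    return (-1j) ** (l + 1) * cmath.exp(1j * x) / x * total


def riccati(l, x):
    """Return (J, J', H, H') for the Riccati functions J_l = x j_l and
    H_l = x h_l^(1), using (x f_l)' = x f_{l-1} - l f_l (requires l >= 1)."""
    x = complex(x)
    J = x * spherical_j(l, x)
    H = x * spherical_h1(l, x)
    dJ = x * spherical_j(l - 1, x) - l * spherical_j(l, x)
    dH = x * spherical_h1(l - 1, x) - l * spherical_h1(l, x)
    return J, dJ, H, dH


def eig_T2(l, ka):
    """Eigenvalue of T^2(k); identical for the X_lm and U_lm modes."""
    J, dJ, H, dH = riccati(l, ka)
    return -J * H * dJ * dH


def eig_TikT(l, ka):
    """Eigenvalues of T(ik) o T(k) for the (X_lm, U_lm) modes."""
    J, dJ, H, dH = riccati(l, ka)
    Ji, dJi, Hi, dHi = riccati(l, 1j * ka)
    return -dJi * dHi * J * H, -Ji * Hi * dJ * dH


def eig_Kplus(l, ka):
    """Eigenvalues of the MFIE operator K+(k) for the (X_lm, U_lm) modes."""
    J, dJ, H, dH = riccati(l, ka)
    return 1j * dJ * H, -1j * J * dH


if __name__ == "__main__":
    ka = 1.0
    for l in (1, 5, 20, 40):
        print(l, eig_T2(l, ka), eig_TikT(l, ka))
    # Resonant radius quoted in Section 4: r = 0.43667457 wavelengths.
    ka0 = 2 * math.pi * 0.43667457
    print("K+ (X, l=1) at resonance:", abs(eig_Kplus(1, ka0)[0]))
    print("T(ik)oT(k) (X, l=1) there:", abs(eig_TikT(1, ka0)[0]))
```

For increasing l the printed values approach $-\tfrac{1}{4}$ for $T^2(k)$ and $\pm\tfrac{i}{4}$ for $T(ik) \circ T(k)$. At the resonant radius the $K^+$ eigenvalue of the $\mathbf{X}_{1m}$ mode vanishes to within rounding, while the corresponding $T(ik) \circ T(k)$ eigenvalue stays well away from zero.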

Since $T(ik) \circ T(k)$ is a second-kind integral operator (in the sense described
in the Appendix) and does not share any resonances with $K^+(k)$, we are finally
in a position to write a well-conditioned CFIE operator. The simplest form of
such an operator is

$$T(ik) \circ T(k) + \alpha K^+(k), \tag{16}$$

where $\alpha$ is a constant to be chosen. In creating this CFIE operator we have
preconditioned the EFIE part before adding to it the MFIE part (which is
already a second-kind integral operator). The same applies to the excitation
side of the equation. The resulting CFIE is

$$-T(ik)\,(\mathbf{n} \times \mathbf{E}^i) + \alpha Z\,\mathbf{n} \times \mathbf{H}^i = \left[T(ik) \circ T(k) + \alpha K^+(k)\right] \mathbf{J}. \tag{17}$$

The eigensystem for the CFIE operator (16) is

$$\left[T(ik) \circ T(k) + \alpha K^+(k)\right] \begin{Bmatrix} \mathbf{X}_{lm} \\ \mathbf{U}_{lm} \end{Bmatrix} = - \begin{Bmatrix} \left[\mathbb{J}_l'(ika)\,\mathbb{H}_l'(ika)\,\mathbb{J}_l(ka) - i\alpha\,\mathbb{J}_l'(ka)\right] \mathbb{H}_l(ka)\,\mathbf{X}_{lm} \\ \left[\mathbb{J}_l(ika)\,\mathbb{H}_l(ika)\,\mathbb{J}_l'(ka) + i\alpha\,\mathbb{J}_l(ka)\right] \mathbb{H}_l'(ka)\,\mathbf{U}_{lm} \end{Bmatrix}. \tag{18}$$

If one chooses $\alpha = \pm 1$ then, as a function of the argument ka, these eigenvalues
have no zeros. For $\alpha = +1$, they circle the origin of the complex plane.

Other well-conditioned CFIE operators can be devised, for example, by
preconditioning the MFIE part before combining it with the preconditioned EFIE
part. We have investigated two forms:

$$T(ik) \circ T(k) + \alpha K^+(ik) \circ K^+(k) \tag{19}$$

and

$$T(ik) \circ T(k) + \alpha\,\mathbf{n} \times K^+(ik) \circ \mathbf{n} \times K^+(k). \tag{20}$$

Our  experience  shows  the  numerical  behavior  of  all  three  CFIE  formulations  to 
be  similar. 

We  have  proven  the  CFIE  operators  in  (16),  (19)  and  (20)  to  be  second-kind 
and  resonance-free  for  spheres.  However,  given  that  the  asymptotic  behavior  of 
the  eigenvalues  on  a  smooth  surface  stems  from  the  short  distance  behavior  of 
the  kernel,  we  argue  (following  the  theorems  proved  in  [9])  that  the  asymptotic 
behavior  of  the  various  operators  on  spheres  should  also  obtain  for  any  closed 
surface  that  can  be  obtained  by  smooth  deformation  of  a  sphere.  The  numerical 
results  presented  in  Section  4  support  this  argument.  We  also  present  results  for 
a  cube,  which,  like  many  targets  of  practical  interest,  has  geometric  singularities. 
These  results  suggest  that  the  new  CFIE  formulations  should  be  well  conditioned 
for  a  wide  class  of  closed  surfaces. 
3 A Different Form of the Preconditioned EFIE Operator

There are several ways to produce a Nyström discretization of the product
operator $T(k_1) \circ T(k_2)$. The simplest and most straightforward approach,
multiplying the discretized representations of the individual operators, can lead to
numerical difficulties. The reason is that it is relatively difficult to make the
discretized representations of the hypersingular part of each operator sufficiently
accurate (especially for high-spatial-frequency eigenmodes) to numerically effect
the cancellation that obtains analytically.

Effective discretizations of $T(k_1) \circ T(k_2)$ can be obtained either by discretizing
the product operator directly or by reformulating the product operator to
eliminate the product of hypersingular operators. We have not implemented the
first method because of the added complexity it entails. We have implemented
the second approach using a reformulated product operator that eliminates all
instances of hypersingular operators. A short derivation of the reformulated
equation is given below.

The first step toward obtaining a more useful form of the product operator
$T(k_1) \circ T(k_2)$ is to separate each integral operator into its singular and
hypersingular components. Introducing the abbreviations


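$$T_1 = T(k_1), \tag{21}$$

$$T_2 = T(k_2), \tag{22}$$

we write

$$T_1 = ik_1\,T_1^s + \frac{i}{k_1}\,T_1^h, \tag{23}$$

$$T_2 = ik_2\,T_2^s + \frac{i}{k_2}\,T_2^h, \tag{24}$$

where

$$T_m^s\,\mathbf{J} \equiv \mathbf{n}(\mathbf{x}) \times \int_S ds'\, G(k_m,\mathbf{x},\mathbf{x}')\,\mathbf{J}(\mathbf{x}'), \tag{25}$$

$$T_m^h\,\mathbf{J} \equiv (\mathbf{n}(\mathbf{x}) \times \nabla) \int_S ds'\, \nabla G(k_m,\mathbf{x},\mathbf{x}') \cdot \mathbf{J}(\mathbf{x}'). \tag{26}$$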

The product operator $T_1 \circ T_2$ can be expanded into four terms. Each of the two
cross terms, $T_1^s \circ T_2^h$ and $T_1^h \circ T_2^s$, can be transformed (by Stokes’s theorem) into
the product of new, single-gradient integral operators on S plus a line integral
around the boundary of S. The term formed by the product of hypersingular
integral operators, $T_1^h \circ T_2^h$, reduces to a line integral. The result is further
simplified by noticing that two of the three line integrals, when applied to $\mathbf{J}$,
can be combined into a single term whose argument is identical to the incident
electric field $\mathbf{E}^i$ by virtue of (1).

The next step is to reformulate the excitation side of the equation, taking
advantage of the fact that the incident wave obeys Maxwell’s equations. By
applying Stokes’s theorem, we rewrite the term $T_1^h\,[\mathbf{n}(\mathbf{x}') \times \mathbf{E}^i(\mathbf{x}')]$ as the sum
of a single-gradient integral operator on $\nabla' \times \mathbf{E}^i(\mathbf{x}')$ and a line integral that
exactly cancels the line integral involving $\mathbf{E}^i$ on the other side of the equation.
A further simplification follows from Faraday’s Law, $\nabla \times \mathbf{E} = i\omega\mu\,\mathbf{H}$.

The  final  result  for  the  analytically  preconditioned  EFIE  with  reformulated 
integral  operator  product  is 


-  ihT?  (n  x  E‘)  -  (n  •  H*) 


=  o  if  +  1 7f  o  7*  -  klk2Tf  o  Ti  -  gif  o  if)  J, 

where  the  various  integral  operators  are  defined  by 


d>  =  n(x)  x  J  ds'  VG  (km,x,  x')  <j>  (x') , 

TmV  =  n(x)  x  f  ds'  n(x')  x  V'G  {km,x,x‘)  <p(x') , 
J  s 

T^i  =  j  ds'  VG(*m,x,x').  f(x'), 
rrnf  =  nW  •  J  ds'  VG  (km,x, x')  X  f  (x') , 

=  n(x)  x  J^ds'  G{km,x,x')f{x') , 

Trn4>  =  n(x)  x<f  dl'  G(km, x, x')  <j> (x') , 

JdS 


(27) 

(28) 

(29) 

(30) 

(31) 

(32) 

(33) 


with  m  =  1,2.  Note  that  T®,  T&,  and  map  scalar  functions  to  surface 
vector  functions,  whereas  and  do  the  reverse.  The  operator  on  the  right 
hand  side  of  (27)  maps  surface  vector  functions  into  surface  vector  functions. 

In the remainder of this section we discuss closed surfaces and observe that
$T_1 \circ T_2$ behaves like a second-kind integral operator. For open surfaces, the
situation is somewhat more complicated in that additional analytical machinery
is required to convert (27) into a second-kind integral operator. We have
performed such analyses for the 2d and 3d scalar cases, and will report these
results in the future.

If  S  is  a  closed  surface,  the  term  Tf  o  T/j  vanishes,  and  (27)  simplifies  to 
-ik'T?  (n  x  Ef)  -  Z^T?  (n  •  H*)  =  5  (klt k2)  J,  (34) 

where 

Si2  s  S (kuk2)  =  J-T?  O  T[  +  J±lf  O  if  -  klk2T?  o  Ti.  (35) 
We  note  several  features  of  S12. 
First, all of the individual integral operators that comprise $S_{12}$ involve kernels
with one or no gradients on the Helmholtz Green’s function G. All such
integral operators are bounded.


Second,  the  eigenvalues  of  the  integral  operator  S12  do  not  accumulate  at  the 
origin.  We  will  demonstrate  this  by  examining  its  three  components  o  T2  , 
T\  °^2i  and  Tf  oT2.  The  operator  oTj  is  a  second-kind  integral  operator 
for  the  transverse  (divergence-free)  component  of  J,  and  is  identically  zero  for 
the  longitudinal  (irrotational)  component  of  J.  Likewise,  the  operator  if  o  Tf 
is  a  second-kind  operator  for  the  longitudinal  component  of  J,  and  is  identically 
zero  for  the  transverse  component  of  J.  Since  any  surface  current  distribution 
can  be  decomposed  into  longitudinal  and  transverse  components  [11],  the  sum 
if  TiQ  0  T2  +  feTi  °  T2  is  a  second-kind  integral  operator;  subtracting  ky  k2T?  o 
T2  >  a  compact  operator,  does  not  change  this  result.  As  observed  in  Section  2, 
we can avoid resonance sharing by setting $k_1 = ik$ and $k_2 = k$. In this case, the
eigenvalues of $S_{12}$ accumulate at two points, $\pm\tfrac{i}{4}$, rather than at $-\tfrac{1}{4}$.

Third, the spectrum of $S_{12}$, after discretization, is bounded and includes
accumulation points at the expected locations. However, an accurate discretization
will have zero (or very small) eigenvalues wherever the EFIE operator $T(k_2)$
has a resonance. Thus, it has to be combined with an appropriate discretization
of the MFIE operator to obtain an effective discretization of the CFIE.

Finally, it should be noted that (34) is manifestly insusceptible to the
“low-frequency” problem that plagues the EFIE. Since the well-conditioned behavior
of  S12  comes  from  the  composite  operators  o  T2  and  j fc-Tf  o  T2,  both  of 

whose  prefactors  have  modulus  unity  (assuming  \ky\  =  |*2|  =  k),  and  since  the 
term  kik2Tf  o  T2  tends  to  zero  as  k  ->  0,  the  full  operator  Si 2  remains  well 
conditioned  in  the  limit  of  low  frequency. 

In summary, although the operators $T_1 \circ T_2$ and $S_{12}$ have identical spectral
properties for closed bodies, it is easier to construct an accurate Nyström
discretization for $S_{12}$ because it is composed of less singular integral operators.
Matrix representations of $S_{12}$ have bounded spectra, but also suffer from spurious
resonances inherited from the EFIE operator $T(k_2)$. These resonances can
be eliminated by combining $S_{12}$ with $K^+(k_2)$ (or the modified MFIE operators
in (19) and (20)). The result is a well-conditioned system of linear algebraic
equations.


4  Numerical  Results 

In  this  section  we  compare  the  numerical  performance  of  the  conventional  CFIE 
(referred  to  below  as  CFIE1) 

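$$-\mathbf{n} \times (\mathbf{n} \times \mathbf{E}^i) + Z\,\mathbf{n} \times \mathbf{H}^i = \left[\mathbf{n} \times T(k) + K^+(k)\right] \mathbf{J} \tag{36}$$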
with the preconditioned CFIE (CFIE2)
kTs  0 ik )  nxE'  +  iZTa  (ik)  n-W-Znx  H' 

=  {i  [~Ta  (ik)  o  Tt  ( k )  +  7*  (ik)  o  (A)  -  k2Ts  (ik)  o  T5  (A)]  -  K+(k)}  J 

(37) 

produced by combining (17) (with $\alpha = -1$) and (34) (with $k_1 = ik$ and $k_2 = k$).
We discretized the individual operators in these equations using a high-order
Nyström scheme [13]. In all cases, the wave impedance Z was set to unity.

We  present  three  examples.  The  first  example  shows  how  the  condition 
number  of  each  operator,  defined  as  the  ratio  of  the  largest  to  smallest  singular 
values,  depends  on  the  fineness  of  discretization.  Table  1  lists  the  condition 
number  (CN)  of  the  matrix  representing  each  CFIE  operator  as  the  size  of  the 
sphere  decreases.  In  all  cases,  the  same  discretization  was  used,  created  by 
placing  a  6-point  quadrature  rule  on  each  of  the  80  nearly  identical  patches  that 
cover  the  sphere,  for  a  total  of  960  unknowns.  As  the  sphere  radius  decreases, 
the  condition  number  for  the  CFIE2  integral  operator  stabilizes  at  about  2, 
whereas  the  condition  number  of  the  CFIE1  integral  operator  continues  to  grow 
in  inverse  proportion  to  the  radius. 


radius (λ)    CFIE1    CFIE2
1             4.2      3.04
1/4           15       2.68
1/16          59       2.04
1/64          230      1.99
1/256         940      1.97
1/1024        3800     1.97
1/4096        15000    1.97

Table  1:  Condition  number  of  CFIE  matrices  for  shrinking  PEC  spheres 
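
The scaling claimed for Table 1 can be read off from the ratios of successive entries. The short sketch below only re-uses the tabulated values; the function name `growth_ratios` is an illustrative choice. Each row reduces the radius by a factor of 4, so a condition number inversely proportional to the radius should grow by about 4 per row, while a stabilized condition number should have ratios approaching 1.

```python
# Condition numbers transcribed from Table 1 (radius in wavelengths).
radii = [1, 1 / 4, 1 / 16, 1 / 64, 1 / 256, 1 / 1024, 1 / 4096]
cn_cfie1 = [4.2, 15, 59, 230, 940, 3800, 15000]
cn_cfie2 = [3.04, 2.68, 2.04, 1.99, 1.97, 1.97, 1.97]


def growth_ratios(values):
    """Ratio of each entry to the previous one."""
    return [b / a for a, b in zip(values, values[1:])]


if __name__ == "__main__":
    rows = zip(radii[1:], growth_ratios(cn_cfie1), growth_ratios(cn_cfie2))
    for r, g1, g2 in rows:
        print(f"radius {r:.6f}: CFIE1 x{g1:.2f}, CFIE2 x{g2:.2f}")
```

The CFIE1 ratios all lie between roughly 3.6 and 4.1, consistent with growth in inverse proportion to the radius, whereas the CFIE2 ratios never exceed 1.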

The  second  test  compares  iterative  solver  performance  for  the  new  CFIE  and 
the conventional CFIE. The target geometry consists of two PEC spheres, one
with a radius of λ/2, the other set at a resonant radius, namely, the first zero
of $\mathbb{J}_1'(2\pi r/\lambda)$, or $r \approx 0.43667457\,\lambda$. The spheres are separated by a λ/100 gap.
We  subdivided  the  patches  near  the  gap  by  a  factor  of  about  10  to  adequately 
resolve  the  currents,  which  vary  rapidly  there.  Table  2  compares  iteration  counts 
and  radar  cross  section  (RCS)  errors  for  several  discretizations.  The  iterations 
columns  list  the  maximum  number  of  iterations  a  conjugate  gradient  squared 
(CGS)  routine  required  to  reach  a  residual  error  of  10— A  solution  computed 
from  a  substantially  more  refined  discretization  provided  an  accuracy  reference. 
The  stated  error  is  the  root  mean  squared  (RMS)  value  of  the  difference  between 
the monostatic φφ RCS of the comparison solution and the reference solution at
181  angles.  For  identical  discretizations,  the  two  methods  had  about  the  same 
error.  The  data  show  a  dramatic  difference,  however,  in  the  iteration  count 
behavior  of  the  two  methods  in  response  to  discretization  refinements. 
                        CFIE1                   CFIE2
unknowns    patches     iterations    error     iterations    error
1496        748         60            0.46      9             0.40
4488        748         126           0.18      11            0.18
996         498         44            0.61      9             0.48
2988        498         103           0.23      12            0.16
5976        498         163           0.016     11            0.029

Table  2:  Iteration  count  and  solution  error  vs.  discretization  for  two  PEC 
spheres. 


The  third  test  also  compares  iterative  solver  performance  for  the  two  CFIE 
formulations. In this case the target is a cube of size 1λ. We present numerical
results  for  five  different  discretizations,  the  first  of  which  was  derived  from  a 
mesh  (i.e.,  a  set  of  patches)  obtained  by  dividing  each  face  into  four  squares. 
The  second  mesh  was  constructed  from  the  first  one  by  subdividing  each  square 
into  four  smaller  squares.  The  third  mesh  was  constructed  from  the  second 
by  subdividing  edge-touching  patches  in  half  along  a  line  parallel  to  the  edge; 
patches adjacent to two edges (i.e., corner patches) were divided into quarters.
Meshes  for  the  fourth  and  fifth  discretizations  were  constructed  by  recursively 
applying  the  procedure  by  which  the  third  mesh  was  constructed  from  the  sec¬ 
ond.  This  process,  known  as  patch  tapering,  is  useful  for  resolving  the  source 
singularities  that  arise  in  the  vicinity  of  geometric  singularities.  It  also  puts 
stress  on  the  conventional  CFIE  because  points  near  edges  get  close  together. 
Table  3  lists  the  maximum  and  average  number  of  iterations  the  CGS  routine 
needed  to  obtain  solutions  for  92  independent  excitations  to  a  residual  error  of 
10 . The total number of unknowns is the result of using a 9-point quadrature
rule on each square or rectangular patch. The iteration count for CFIE2 grows
very slowly with increasing taper depth, whereas for CFIE1 it increases steadily,
in accordance with expectations.


            taper       CFIE1           CFIE2
unknowns    depth       max     ave     max     ave
432         0           12      6.5     10      4.3
1728        1           18      9.9     11      4.9
3888        2           26      14      11      4.9
6912        3           41      23      11      5.5
10800       4           58      36      13      5.7

Table 3: Iteration count vs. taper depth for 1λ PEC cube.
5 Conclusions and Generalizations

The  classical  electric  field  integral  operator  is  its  own  perfect  preconditioner,  in 
the  sense  that  applying  it  to  both  sides  of  the  EFIE  converts  the  latter  into  a 
second-kind  integral  equation.  When  the  preconditioned  electric  field  integral 
operator  is  used  as  a  component  of  the  CFIE,  the  latter  is  also  converted  into 
a  second-kind  integral  equation.  Furthermore,  if  the  preconditioning  electric 
field  operator  corresponds  to  a  complex  wavenumber,  the  resulting  CFIE  has 
no  spurious  resonances. 

In  this  paper,  we  describe  in  some  detail  an  improved  CFIE  for  electromag¬ 
netic  scattering  from  perfectly  conducting  closed  surfaces,  leading  to  a  signif¬ 
icant  improvement  in  the  performance  of  iterative  solvers;  incorporating  the 
approach  into  the  existing  “fast”  solvers  is  completely  straightforward.  The  re¬ 
sults  presented  here  admit  generalizations  in  several  directions.  The  extensions 
discussed  below  are  currently  under  investigation,  and  will  be  reported  at  a  later 
date. 

The  approach  of  this  paper  can  be  applied,  with  minor  modifications,  to 
surface  scattering  with  more  general  boundary  conditions.  The  extension  to  an 
interface  between  two  dielectrics,  for  example,  is  straightforward;  the  resulting 
operators  have  condition  numbers  that  are  in  fact  somewhat  lower  than  in  the 
case  described  here.  While  structures  consisting  of  several  dielectrics  do  not 
appear  to  present  serious  difficulties,  places  where  several  different  dielectrics 
come  in  contact  with  each  other  require  separate  analytical  treatment. 

The  approach  of  this  paper  has  to  be  modified  only  slightly  in  order  to  obtain 
second  kind  integral  equations  describing  electromagnetic  scattering  from  open 
perfectly conducting surfaces. In this environment, the CFIE is replaced with
an appropriately preconditioned EFIE, and the edge of the surface requires
separate treatment. The result is a pair of coupled integral equations, one on
the surface itself, and the other on the edge of the surface (which is, obviously,
a curve in $\mathbb{R}^3$). At this time, the theory has been constructed for the scalar case
when  the  boundary  of  the  surface  is  a  sufficiently  smooth  curve;  the  analysis  of 
open  surfaces  whose  boundaries  have  corners  is  in  progress. 

A  Appendix 

The standard definition of a second-kind integral operator is an operator of the
form

$$\lambda I + K, \tag{38}$$

where $\lambda$ is a constant, I is the identity, and K is a compact operator. In
scattering theory, one encounters operators of the form

$$\lambda_1 P_1 + \lambda_2 P_2 + K, \tag{39}$$

where $\lambda_1$ and $\lambda_2$ are constants and $P_1$ and $P_2$ are orthogonal projection operators
such that

$$P_1 + P_2 = I. \tag{40}$$

Operators  of  the  form  (39)  possess  most  of  the  desirable  properties  of  second- 
kind  integral  operators.  In  a  mild  abuse  of  terminology,  we  refer  to  such  expres¬ 
sions  as  second-kind  integral  operators  throughout  this  letter. 
Polynomials are one of the principal tools of classical numerical analysis. When a function
needs  to  be  interpolated,  integrated,  differentiated,  etc.,  it  is  assumed  to  be  approximated 
by  a  polynomial  of  a  certain  fixed  order  (though  the  polynomial  is  almost  never  constructed 
explicitly),  and  a  treatment  appropriate  to  such  a  polynomial  is  applied.  We  introduce  anal¬ 
ogous  techniques  based  on  the  assumption  that  the  function  to  be  dealt  with  is  band-limited, 
and  use  the  well-developed  apparatus  of  Prolate  Spheroidal  Wave  Functions  to  construct 
quadratures,  interpolation  and  differentiation  formulae,  etc.  for  band-limited  functions. 
Since  band-limited  functions  are  often  encountered  in  physics,  engineering,  statistics,  etc. 
the  apparatus  we  introduce  appears  to  be  natural  in  many  environments.  Our  results  are 
illustrated  with  several  numerical  examples. 
Prolate Spheroidal Wave Functions, Quadrature, and Interpolation


1  Introduction 

Numerical  quadrature  and  interpolation  are  a  well-developed  part  of  numerical  analysis: 
polynomials  are  the  classical  tool  for  the  design  of  such  schemes.  Conceptually  speaking, 
one assumes that the function is well-approximated by expressions of the form

$$\sum_{j=0}^{n} a_j\,x^j \tag{1}$$

with reasonably small n, and designs algorithms that are effective for functions of the
form (1) (needless to say, one almost never actually computes the coefficients $\{a_j\}$; one
only uses the fact of their existence). Obviously, the polynomial approach is only effective
for functions that are well-approximated by polynomials.

When  one  has  to  handle  functions  that  are  well-behaved  on  the  whole  line  (for  ex¬ 
ample,  in  signal  processing),  polynomials  are  not  an  appropriate  tool.  In  such  cases, 
trigonometric  polynomials  are  used;  existing  tools  are  very  satisfactory  for  dealing  with 
functions  defined  and  well-behaved  on  the  whole  of  R1.  Such  tools,  in  effect,  make  the 
assumption that the functions are band-limited or nearly so: a function $f : \mathbb{R} \to \mathbb{R}$ is
said to be band-limited if there exist a positive real c and a function $\sigma \in L^2[-1,1]$ such
that

$$f(x) = \int_{-1}^{1} e^{icxt}\,\sigma(t)\,dt. \tag{2}$$

However, in many cases, we are confronted with band-limited functions defined on intervals
(or, more generally, on compact regions in $\mathbb{R}^n$). Wave phenomena are a rich source
of such functions, both in the engineering and computational contexts; they are also
encountered in fluid dynamics, signal processing, and many other areas. Often, such
functions can be effectively approximated by polynomials via standard tools of classical
analysis. However, even when such approximations are feasible, they are usually not
optimal. Smooth periodic functions are a good illustration of this observation: while
they can be approximated by polynomials (for example, via Chebyshev or Legendre
expansions), they are more efficiently approximated by Fourier expansions, both for
analytical and numerical purposes. It would appear that an approach explicitly based on
trigonometric polynomials could be more efficient in dealing with band-limited functions.

In the engineering context, such an apparatus was constructed more than 30 years
ago (see [20]-[21], [7]-[9]). The natural tool for analyzing band-limited functions on $\mathbb{R}^1$ is
the Fourier Transform, unless the functions are periodic, in which case the natural tool is
the Fourier Series. The authors of [20]-[21] observe that for the analysis of band-limited
functions on the interval, Prolate Spheroidal Wave Functions are likewise a natural
approach. The authors also construct a multidimensional version of the theory, though
their apparatus is only complete for the case of spherical regions.

The present paper constructs tools for the use of the approach of [20]-[21] in the
modern computational environment. We construct a class of quadratures for band-limited
functions that closely parallel the Gaussian quadratures for polynomials. The
nodes are very close to being roots of appropriately chosen Prolate Spheroidal Wave
Functions, the resulting quadratures are stable, and all weights are positive. As in the
case of polynomials, there are interpolation, differentiation and indefinite integration
schemes associated with the obtained quadratures, exact on certain classes of band-limited
functions. These procedures are the main tools necessary for the numerical use
of spectral discretizations based on Prolate Spheroidal Wave Functions, instead of on
the usual polynomial bases. When dealing with band-limited functions, the number of
nodes required by these procedures to obtain a prescribed accuracy is much smaller than
that required by their polynomial-based counterparts. An additional bonus is the fact
that the condition number of differentiation of prolate spheroidal wave functions is smaller
than that of differentiation of the usual polynomial basis functions (see Section 8 below).

This paper is organized as follows. Section 2 summarizes various standard mathematical
facts used in the remainder of the paper. Section 3 contains derivations of various
results used in the algorithms described in later sections. Section 4 describes algorithms
for evaluation of prolate spheroidal wave functions and associated eigenvalues. Section 5
describes a construction of quadratures for band-limited functions. Section 6 describes
an alternative approach to arriving at such quadratures; it shows that roots of appropriately
chosen prolate spheroidal wave functions can serve as quadrature nodes. Section 7
analyzes the use of prolate spheroidal wave functions for interpolation. Section 8 contains
results of our numerical experiments with quadratures and interpolation. Section 9
contains a number of miscellaneous properties of prolate spheroidal wave functions, and
Section 10 contains generalizations and conclusions.

2  Mathematical  Preliminaries 

As a matter of convention, in this paper the norm of a function is, unless stated otherwise,
its $L^2$ norm:

$$\|f\| = \sqrt{\int |f(x)|^2\, dx}. \qquad (3)$$

2.1  Chebyshev systems

Definition 2.1 A sequence of functions $\phi_1, \ldots, \phi_n$ will be referred to as a Chebyshev
system on the interval $[a, b]$ if each of them is continuous and the determinant

$$\begin{vmatrix} \phi_1(x_1) & \cdots & \phi_1(x_n) \\ \vdots & & \vdots \\ \phi_n(x_1) & \cdots & \phi_n(x_n) \end{vmatrix} \qquad (4)$$

is nonzero for any sequence of points $x_1, \ldots, x_n$ such that $a \le x_1 < x_2 < \cdots < x_n \le b$.

An  alternate  definition  of  a  Chebyshev  system  is  that  any  linear  combination  of  the 
functions  with  nonzero  coefficients  must  have  fewer  than  n  zeros. 
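
As a small numerical illustration of Definition 2.1 (a sketch added for clarity, not part of the original development), the fragment below evaluates the determinant (4) for the monomials $1, x, x^2, x^3$, for which it is the Vandermonde determinant. The helper name `chebyshev_determinant` and the sample points are hypothetical choices made only for this example.

```python
import numpy as np

def chebyshev_determinant(functions, points):
    """Determinant of the matrix with entries functions[i](points[j])."""
    matrix = np.array([[f(x) for x in points] for f in functions])
    return np.linalg.det(matrix)

# Monomials 1, x, x^2, x^3; the default argument pins each exponent.
monomials = [lambda x, p=p: x**p for p in range(4)]

# Any strictly increasing points may be used; these are arbitrary.
points = [-0.7, -0.1, 0.4, 0.9]
det = chebyshev_determinant(monomials, points)
print(det)
```

For the monomials the determinant equals the product of all differences $x_j - x_i$ with $i < j$, which is nonzero whenever the points are distinct.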

Examples of Chebyshev and extended Chebyshev systems include the following (additional
examples can be found in [11]).

Example 2.1 The powers $1, x, x^2, \ldots, x^n$ form an extended Chebyshev system on the
interval $(-\infty, \infty)$.

Example 2.2 The exponentials $e^{-\lambda_1 x}, e^{-\lambda_2 x}, \ldots, e^{-\lambda_n x}$ form an extended Chebyshev
system for any $\lambda_1, \ldots, \lambda_n > 0$ on the interval $[0, \infty)$.

Example 2.3 The functions $1, \cos x, \sin x, \cos 2x, \sin 2x, \ldots, \cos nx, \sin nx$ form a
Chebyshev system on the interval $[0, 2\pi]$.


2.2  Generalized  Gaussian  quadratures 

A quadrature rule is an expression of the form

$$\sum_{j=1}^{n} w_j\, \phi(x_j), \qquad (5)$$

where the points $x_j \in \mathbb{R}$ and coefficients $w_j \in \mathbb{R}$ are referred to as the nodes and weights
of the quadrature, respectively. They serve as approximations to integrals of the form

$$\int_a^b \phi(x)\, \omega(x)\, dx, \qquad (6)$$

with $\omega$ being an integrable non-negative function.

Quadratures are typically chosen so that the quadrature (5) is equal to the desired
integral (6) for some set of functions, commonly polynomials of some fixed order. Of
these, the classical Gaussian quadrature rules consist of $n$ nodes and integrate polynomials
of order $2n - 1$ exactly. In [13], the notion of a Gaussian quadrature was generalized
as follows:

Definition 2.2 A quadrature formula will be referred to as Gaussian with respect to a
set of $2n$ functions $\phi_1, \ldots, \phi_{2n} : [a, b] \to \mathbb{R}$ and a weight function $\omega : [a, b] \to \mathbb{R}^+$, if it
consists of $n$ weights and nodes, and integrates the functions $\phi_i$ exactly with the weight
function $\omega$ for all $i = 1, \ldots, 2n$. The weights and nodes of a Gaussian quadrature will be
referred to as Gaussian weights and nodes respectively.

The  following  theorem  appears  to  be  due  to  Markov  [14,  15];  proofs  of  it  can  also  be 
found  in  [12]  and  [11]  (in  a  somewhat  different  form). 

Theorem 2.1 Suppose that the functions $\phi_1, \ldots, \phi_{2n} : [a, b] \to \mathbb{R}$ form a Chebyshev
system on $[a, b]$. Suppose in addition that $\omega : [a, b] \to \mathbb{R}$ is a non-negative integrable
function. Then there exists a unique Gaussian quadrature for the functions
$\phi_1, \ldots, \phi_{2n}$ on $[a, b]$ with respect to the weight function $\omega$. The weights of this quadrature
are positive.

While  the  existence  of  Generalized  Gaussian  Quadratures  was  observed  more  than 
100  years  ago,  the  constructions  found  in  [14,  15],  [6,  12],  [10,  11]  do  not  easily  yield 
numerical  algorithms  for  the  design  of  such  quadrature  formulae;  such  algorithms  have 
been  constructed  recently  (see  [13,  25,  2]). 

Remark 2.1 It might be worthwhile to observe here that when a Generalized Gaussian
quadrature is to be constructed, the determination of its nodes tends to be the critical
step (though the procedure of [13, 25, 2] determines the nodes and weights simultaneously).
Indeed, once the nodes $x_1, x_2, \ldots, x_n$ have been found, the weights $w_1, w_2, \ldots, w_n$
can be determined easily as the solution of the $n \times n$ system of linear equations

$$\sum_{j=1}^{n} w_j\, \phi_i(x_j) = \int_a^b \phi_i(x)\, dx, \qquad (7)$$

with $i = 1, 2, \ldots, n$.
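
The linear solve described in Remark 2.1 can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `weights_from_nodes` is hypothetical, the functions $\phi_i$ are taken to be the monomials $1, x, \ldots, x^{n-1}$ on $[-1, 1]$, and the nodes are the classical Gauss-Legendre nodes supplied by NumPy, so that the recovered weights can be compared with the known ones.

```python
import numpy as np

def weights_from_nodes(functions, integrals, nodes):
    """Solve sum_j w_j * phi_i(x_j) = integral of phi_i, for i = 1..n."""
    matrix = np.array([[phi(x) for x in nodes] for phi in functions])
    return np.linalg.solve(matrix, np.asarray(integrals, dtype=float))

n = 4
# Classical Gauss-Legendre nodes; the reference weights are kept for comparison.
nodes, reference_weights = np.polynomial.legendre.leggauss(n)

# phi_i(x) = x^p, p = 0..n-1 (assumption made for this example only).
functions = [lambda x, p=p: x**p for p in range(n)]
# Integral of x^p over [-1, 1]: 2/(p+1) for even p, 0 for odd p.
integrals = [2.0 / (p + 1) if p % 2 == 0 else 0.0 for p in range(n)]

weights = weights_from_nodes(functions, integrals, nodes)
print(weights)
```

Only $n$ moment conditions are imposed, yet because the nodes are Gaussian the resulting rule is exact for polynomials up to degree $2n - 1$.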

2.3  Legendre  Polynomials 

In agreement with standard practice, we will be denoting by $P_n$ the classical Legendre
polynomials, defined by the three-term recursion

$$P_{n+1}(x) = \frac{2n+1}{n+1}\, x\, P_n(x) - \frac{n}{n+1}\, P_{n-1}(x), \qquad (8)$$

with the initial conditions

$$P_0(x) = 1, \qquad P_1(x) = x; \qquad (9)$$

as is well-known,

$$P_k(1) = 1 \qquad (10)$$

for all $k = 0, 1, 2, \ldots$, and each of the polynomials $P_k$ satisfies the differential equation

$$(1 - x^2)\, \frac{d^2 P_k(x)}{dx^2} - 2x\, \frac{dP_k(x)}{dx} + k\,(k+1)\, P_k(x) = 0. \qquad (11)$$

The polynomials defined by the formulae (8), (9) are orthogonal on the interval $[-1, 1]$;
however, they are not orthonormal, since for each $n \ge 0$,

$$\int_{-1}^{1} (P_n(x))^2\, dx = \frac{1}{n + 1/2}; \qquad (12)$$

the normalized version of the Legendre polynomials will be denoted by $\overline{P}_n$, so that

$$\overline{P}_n(x) = P_n(x) \cdot \sqrt{n + 1/2}. \qquad (13)$$
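
The recursion and the normalization above translate directly into code. The sketch below is illustrative only; the function names `legendre_values` and `normalized_legendre_values` are hypothetical, and the orthonormality check in the usage lines relies on NumPy's Gauss-Legendre rule.

```python
import numpy as np

def legendre_values(n_max, x):
    """Return values[k] = P_k(x) for k = 0..n_max via the three-term recursion."""
    x = np.asarray(x, dtype=float)
    values = np.empty((n_max + 1,) + x.shape)
    values[0] = 1.0
    if n_max >= 1:
        values[1] = x
    for n in range(1, n_max):
        values[n + 1] = ((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1)
    return values

def normalized_legendre_values(n_max, x):
    """Scale P_k by sqrt(k + 1/2) so that each function has unit L2 norm on [-1, 1]."""
    scale = np.sqrt(np.arange(n_max + 1) + 0.5)
    return legendre_values(n_max, x) * scale.reshape((-1,) + (1,) * np.ndim(x))

# Usage: the Gram matrix of the normalized polynomials should be the identity.
x, w = np.polynomial.legendre.leggauss(20)
values = normalized_legendre_values(8, x)
gram = (values * w) @ values.T
print(np.max(np.abs(gram - np.eye(9))))
```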

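The following lemma follows immediately from the Cauchy-Schwarz inequality and from
the orthogonality of the Legendre polynomials on the interval $[-1, 1]$:

Lemma 2.2 For all integer $k \ge n$,

$$\left| \int_{-1}^{1} x^k\, \overline{P}_n(x)\, dx \right| < \sqrt{\frac{2}{2k+1}}. \qquad (14)$$

For all integer $0 \le k < n$,

$$\int_{-1}^{1} x^k\, \overline{P}_n(x)\, dx = 0. \qquad (15)$$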

2.4  Convolutional  Volterra  Equations 

A convolutional Volterra equation of the second kind is an expression of the form

$$\varphi(x) = \int_a^x K(x - t)\, \varphi(t)\, dt + \sigma(x), \qquad (16)$$

where $a$, $b$ are a pair of numbers such that $a < b$, the functions $\sigma, K : [a, b] \to \mathbb{C}$ are
square-integrable, and $\varphi : [a, b] \to \mathbb{C}$ is the function to be determined. Proofs of the
following theorem can be found in [4], as well as in many other sources.

Theorem 2.3 The equation (16) always has a unique solution on the interval $[a, b]$. If
both functions $K$, $\sigma$ are $k$ times continuously differentiable, the solution is also $k$ times
continuously differentiable.
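
To make the structure of such equations concrete, the following sketch solves a convolutional Volterra equation of the second kind by marching forward with the trapezoidal rule. It is only an illustration: the discretization and the name `solve_volterra_second_kind` are assumptions of this example and are not taken from [4]. The test case $K \equiv 1$, $\sigma \equiv 1$ on $[0, 1]$ has the exact solution $e^x$.

```python
import numpy as np

def solve_volterra_second_kind(kernel, sigma, a, b, n_steps):
    """Trapezoidal marching for phi(x) = int_a^x K(x - t) phi(t) dt + sigma(x)."""
    x = np.linspace(a, b, n_steps + 1)
    h = (b - a) / n_steps
    phi = np.empty(n_steps + 1, dtype=complex)
    phi[0] = sigma(x[0])                      # the integral vanishes at x = a
    for i in range(1, n_steps + 1):
        k = kernel(x[i] - x[:i])              # K(x_i - x_j) for j = 0..i-1
        history = h * (np.dot(k[1:], phi[1:i]) + 0.5 * k[0] * phi[0])
        # The unknown phi[i] enters with trapezoidal weight h/2 * K(0).
        phi[i] = (sigma(x[i]) + history) / (1.0 - 0.5 * h * kernel(0.0))
    return x, phi

# Usage: K = 1 and sigma = 1 give phi(x) = exp(x).
x, phi = solve_volterra_second_kind(
    lambda s: np.ones_like(s, dtype=float), lambda t: 1.0, 0.0, 1.0, 200
)
print(np.max(np.abs(phi - np.exp(x))))
```

Each unknown depends only on previously computed values, which is the discrete counterpart of the unconditional solvability stated in Theorem 2.3.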
2.5  Prolate Spheroidal Wave Functions

In this subsection, we summarize certain facts about the Prolate Spheroidal Wave Functions.
Unless stated otherwise, all these facts can be found in [20, 17].

Given a real $c > 0$, we will denote by $F_c$ the operator $L^2[-1, 1] \to L^2[-1, 1]$ defined
by the formula

$$F_c(\varphi)(x) = \int_{-1}^{1} e^{icxt}\, \varphi(t)\, dt. \qquad (17)$$

Obviously, $F_c$ is compact; we will denote by $\lambda_0, \lambda_1, \ldots, \lambda_n, \ldots$ the eigenvalues of $F_c$
ordered so that $|\lambda_{j-1}| \ge |\lambda_j|$ for all natural $j$. For each non-negative integer $j$, we will
denote by $\psi_j$ the eigenfunction corresponding to $\lambda_j$, so that

$$\lambda_j\, \psi_j(x) = \int_{-1}^{1} e^{icxt}\, \psi_j(t)\, dt, \qquad (18)$$

for all $x \in [-1, 1]$; we adopt the convention that the functions are normalized such that
$\|\psi_j\|_{L^2[-1,1]} = 1$, for all $j$.¹ The following theorem is a combination of several lemmas
from [20], [6], [11].


Theorem 2.4 For any positive real $c$, the eigenfunctions $\psi_0, \psi_1, \ldots$ of the operator $F_c$
are purely real, are orthonormal, and are complete in $L^2[-1, 1]$. The even-numbered
eigenfunctions are even, and the odd-numbered ones are odd. All eigenvalues of $F_c$
are non-zero and simple; the even-numbered eigenvalues are purely real, and the
odd-numbered ones are purely imaginary; in particular, $\lambda_j = i^j\, |\lambda_j|$. The functions $\psi_j$
constitute a Chebyshev system on the interval $[-1, 1]$; in particular, the function $\psi_i$ has exactly
$i$ zeroes on that interval, for any $i = 0, 1, \ldots$.


We will define the self-adjoint operator $Q_c : L^2[-1, 1] \to L^2[-1, 1]$ by the formula

$$Q_c(\varphi)(x) = \frac{1}{\pi} \int_{-1}^{1} \frac{\sin(c\,(x - t))}{x - t}\, \varphi(t)\, dt; \qquad (19)$$

a simple calculation shows that

$$Q_c = \frac{c}{2\pi}\, F_c^{*}\, F_c, \qquad (20)$$

that $Q_c$ has the same eigenfunctions as $F_c$, and that the $j$-th (in descending order)
eigenvalue $\mu_j$ of $Q_c$ is connected with $\lambda_j$ by the formula

$$\mu_j = \frac{c}{2\pi}\, |\lambda_j|^2. \qquad (21)$$

¹ This convention differs from that used in [20]; however, the present paper is concerned almost
exclusively with approximation of functions on $[-1, 1]$, and in that context, the convention that the
functions $\{\psi_j\}$ have unit norm on that interval is by far the most convenient.
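
The calculation connecting $Q_c$ with $F_c$ can be written out explicitly. The following is a worked sketch that assumes the normalization in which the kernel of $Q_c$ is $\sin(c\,(x-t)) / (\pi\,(x-t))$, the choice consistent with the projection operator $P_c$ of (22); $F_c^{*}$ denotes the adjoint of $F_c$.

```latex
\begin{aligned}
F_c^{*}(\varphi)(x) &= \int_{-1}^{1} e^{-icxt}\, \varphi(t)\, dt, \\
(F_c^{*} F_c\, \varphi)(x)
  &= \int_{-1}^{1} e^{-icxs} \int_{-1}^{1} e^{icst}\, \varphi(t)\, dt\, ds
   = \int_{-1}^{1} \left( \int_{-1}^{1} e^{ics(t-x)}\, ds \right) \varphi(t)\, dt \\
  &= \int_{-1}^{1} \frac{2 \sin(c\,(x-t))}{c\,(x-t)}\, \varphi(t)\, dt, \\
\frac{c}{2\pi}\, (F_c^{*} F_c\, \varphi)(x)
  &= \frac{1}{\pi} \int_{-1}^{1} \frac{\sin(c\,(x-t))}{x-t}\, \varphi(t)\, dt .
\end{aligned}
```

Since each $\psi_j$ is real, $F_c^{*} \psi_j = \overline{\lambda_j}\, \psi_j$, hence $F_c^{*} F_c\, \psi_j = |\lambda_j|^2\, \psi_j$, and the corresponding eigenvalue of $\frac{c}{2\pi} F_c^{*} F_c$ is $\frac{c}{2\pi} |\lambda_j|^2$.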
The operator $Q_c$ is obviously closely related to the operator $P_c : L^2[-\infty, \infty] \to L^2[-\infty, \infty]$
defined by the formula

$$P_c(\varphi)(x) = \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{\sin(c \cdot (x - t))}{x - t} \cdot \varphi(t)\, dt, \qquad (22)$$

which, as is well known, is the orthogonal projection operator onto the space of functions
of band limit $c$ on $(-\infty, \infty)$.

For large $c$, the spectrum of $Q_c$ consists of three parts: about $2c/\pi$ eigenvalues that
are very close to 1, followed by order $\log(c)$ eigenvalues which decay exponentially from 1
to nearly 0; the remaining eigenvalues are all very close to zero. The following theorem,
proven (in a slightly different form) in [19], describes the spectrum of $Q_c$ more precisely.


Theorem 2.5 For any positive real $c$ and $0 < \alpha < 1$ the number $N$ of eigenvalues of
the operator $Q_c$ that are greater than $\alpha$ satisfies the inequality

$$\frac{2c}{\pi} + \left( \frac{1}{\pi^2} \log \frac{1-\alpha}{\alpha} \right) \log(c) - 10 \cdot \log(c) < N < \frac{2c}{\pi} + \left( \frac{1}{\pi^2} \log \frac{1-\alpha}{\alpha} \right) \log(c) + 10 \cdot \log(c). \qquad (23)$$

By a remarkable coincidence, the eigenfunctions $\psi_0, \psi_1, \ldots, \psi_n$ of the operator $Q_c$ turn
out to be the Prolate Spheroidal Wave functions, well-known from classical Mathematical
Physics (see, for example, [16]). The following theorem formalizes this statement; it is
proven in a considerably more general form in [21].


Theorem 2.6 For any $c > 0$, there exists a strictly increasing sequence of positive real
numbers $\chi_0, \chi_1, \ldots$ such that for each $j \ge 0$, the differential equation

$$(1 - x^2)\, \psi''(x) - 2x\, \psi'(x) + (\chi_j - c^2 x^2)\, \psi(x) = 0 \qquad (24)$$

has a solution that is continuous on the interval $[-1, 1]$. For each $j \ge 0$, the function $\psi_j$
(defined in Theorem 2.4) is the solution of (24).


3  Analytical  Apparatus 

3.1  Prolate  Series 

Since the functions $\psi_0, \psi_1, \ldots, \psi_n, \ldots$ are a complete orthonormal basis in $L^2[-1, 1]$, any
formula for the inner product of prolate spheroidal wave functions with another function
$f$ is also a formula for the coefficients of an expansion of $f$ into prolate spheroidal functions
(which we will refer to as the prolate expansion of $f$). Thus the following theorem
provides the coefficients of the prolate expansion of the derivative of a prolate spheroidal
function, and also the coefficients of the prolate expansion of a prolate spheroidal wave
function multiplied by $x$. Those coefficients are also the entries of the matrix for differentiation
of a prolate expansion (producing another prolate expansion), and the entries of
the matrix for multiplication of a prolate expansion by $x$, respectively. (These formulae
are not, however, suitable for producing such matrices numerically, since in many cases
they exhibit catastrophic cancellation.)


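Theorem 3.1 Suppose that $c$ is real and positive, and that the integers $m$ and $n$ are
non-negative. If $m = n \pmod 2$, then

$$\int_{-1}^{1} \psi_n'(x)\, \psi_m(x)\, dx = \int_{-1}^{1} x\, \psi_n(x)\, \psi_m(x)\, dx = 0. \qquad (25)$$

If $m \ne n \pmod 2$, then

$$\int_{-1}^{1} \psi_n'(x)\, \psi_m(x)\, dx = \frac{2 \lambda_m^2}{\lambda_m^2 + \lambda_n^2}\, \psi_m(1)\, \psi_n(1), \qquad (26)$$

$$\int_{-1}^{1} x\, \psi_n(x)\, \psi_m(x)\, dx = \frac{2}{ic}\, \frac{\lambda_m \lambda_n}{\lambda_m^2 + \lambda_n^2}\, \psi_m(1)\, \psi_n(1). \qquad (27)$$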

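Proof. Since the functions $\psi_j$ are alternately even and odd, (25) is obvious. In order to
prove (26), we start with the identity

$$\lambda_n\, \psi_n(x) = \int_{-1}^{1} e^{icxt}\, \psi_n(t)\, dt \qquad (28)$$

(see (18) in Subsection 2.5). Differentiating (28) with respect to $x$, we obtain

$$\lambda_n\, \psi_n'(x) = ic \int_{-1}^{1} t\, e^{icxt}\, \psi_n(t)\, dt. \qquad (29)$$

Projecting both sides of (29) on $\psi_m$ and using the identity (28) (with $n$ replaced with
$m$) again, we have

$$\begin{aligned}
\lambda_n \int_{-1}^{1} \psi_n'(x)\, \psi_m(x)\, dx
  &= ic \int_{-1}^{1} \psi_m(x) \int_{-1}^{1} t\, e^{icxt}\, \psi_n(t)\, dt\, dx \\
  &= ic \int_{-1}^{1} t\, \psi_n(t) \int_{-1}^{1} e^{icxt}\, \psi_m(x)\, dx\, dt \\
  &= ic\, \lambda_m \int_{-1}^{1} t\, \psi_n(t)\, \psi_m(t)\, dt. \qquad (30)
\end{aligned}$$

Obviously, the above calculation can be repeated with $m$ and $n$ exchanged, yielding the
identity

$$\lambda_m \int_{-1}^{1} \psi_m'(x)\, \psi_n(x)\, dx = ic\, \lambda_n \int_{-1}^{1} t\, \psi_n(t)\, \psi_m(t)\, dt; \qquad (31)$$

combining (30) with (31), we have

$$\int_{-1}^{1} \psi_m'(x)\, \psi_n(x)\, dx = \frac{\lambda_n^2}{\lambda_m^2} \int_{-1}^{1} \psi_m(x)\, \psi_n'(x)\, dx. \qquad (32)$$

On the other hand, integrating the left side of (32) by parts, we have

$$\int_{-1}^{1} \psi_m'(x)\, \psi_n(x)\, dx = \psi_m(1)\, \psi_n(1) - \psi_m(-1)\, \psi_n(-1) - \int_{-1}^{1} \psi_n'(x)\, \psi_m(x)\, dx. \qquad (33)$$

Since $m \ne n \pmod 2$, we rewrite (33) as

$$\int_{-1}^{1} \psi_m'(x)\, \psi_n(x)\, dx = 2\, \psi_m(1)\, \psi_n(1) - \int_{-1}^{1} \psi_n'(x)\, \psi_m(x)\, dx. \qquad (34)$$

Now, combining (32) and (34) and rearranging terms, we get

$$\int_{-1}^{1} \psi_n'(x)\, \psi_m(x)\, dx = \frac{2 \lambda_m^2}{\lambda_m^2 + \lambda_n^2}\, \psi_m(1)\, \psi_n(1). \qquad (35)$$

Substituting (30) into (35), we get

$$\begin{aligned}
\int_{-1}^{1} x\, \psi_n(x)\, \psi_m(x)\, dx
  &= \frac{\lambda_n}{ic\, \lambda_m} \int_{-1}^{1} \psi_n'(x)\, \psi_m(x)\, dx \\
  &= \frac{\lambda_n}{ic\, \lambda_m}\, \frac{2 \lambda_m^2}{\lambda_m^2 + \lambda_n^2}\, \psi_m(1)\, \psi_n(1) \\
  &= \frac{2}{ic}\, \frac{\lambda_m \lambda_n}{\lambda_m^2 + \lambda_n^2}\, \psi_m(1)\, \psi_n(1). \qquad (36)
\end{aligned}$$

□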


The following corollary, which is an immediate consequence of (32), finds use in the
numerical evaluation of the eigenvalues $\{\lambda_j\}$:

Corollary 3.2 Suppose that $c$ is real and positive, and that the integers $m$ and $n$ are
non-negative. If $m \ne n \pmod 2$, then

$$\frac{\lambda_m^2}{\lambda_n^2} = \frac{\int_{-1}^{1} \psi_n'(x)\, \psi_m(x)\, dx}{\int_{-1}^{1} \psi_m'(x)\, \psi_n(x)\, dx}. \qquad (37)$$


3.2  Decay of Legendre Coefficients of Prolate Spheroidal Wavefunctions

Since each of the functions $\psi_j$ is analytic on $\mathbb{C}$, on the interval $[-1, 1]$ it can be expanded
in a Legendre series of the form

$$\psi_j(x) = \sum_{k=0}^{\infty} \beta_k\, \overline{P}_k(x), \qquad (38)$$

with the coefficients $\beta_k$ decaying superalgebraically; the following lemma and theorem
establish bounds for the decay rate.


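Lemma 3.3 Let $\overline{P}_n(x)$ be the $n$-th normalized Legendre polynomial (defined in (13)).
Then for any real $a$,

$$\int_{-1}^{1} e^{iax}\, \overline{P}_n(x)\, dx = \sum_{k=k_0}^{\infty} \alpha_k \int_{-1}^{1} x^{2k}\, \overline{P}_n(x)\, dx + i \sum_{k=k_0}^{\infty} \beta_k \int_{-1}^{1} x^{2k+1}\, \overline{P}_n(x)\, dx, \qquad (39)$$

where

$$\alpha_k = (-1)^k\, \frac{a^{2k}}{(2k)!}, \qquad (40)$$

$$\beta_k = (-1)^k\, \frac{a^{2k+1}}{(2k+1)!}, \qquad (41)$$

$$k_0 = \lfloor n/2 \rfloor. \qquad (42)$$

Furthermore, for all integer $m \ge \lfloor e \cdot |a| \rfloor + 1$,

$$\left| \int_{-1}^{1} e^{iax}\, \overline{P}_n(x)\, dx - \sum_{k=k_0}^{m-1} \alpha_k \int_{-1}^{1} x^{2k}\, \overline{P}_n(x)\, dx - i \sum_{k=k_0}^{m-1} \beta_k \int_{-1}^{1} x^{2k+1}\, \overline{P}_n(x)\, dx \right| < \left( \frac{1}{2} \right)^{2m}. \qquad (43)$$

In particular, if

$$n \ge 2\, (\lfloor e \cdot |a| \rfloor + 1), \qquad (44)$$

then

$$\left| \int_{-1}^{1} e^{iax}\, \overline{P}_n(x)\, dx \right| < \left( \frac{1}{2} \right)^{n-1}. \qquad (45)$$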

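Proof. The formula (39) follows immediately from Lemma 2.2 and Taylor's expansion
of $e^{iax}$. In order to prove (43), we assume that $m$ is an integer such that

$$m \ge \lfloor e \cdot |a| \rfloor + 1. \qquad (46)$$

Introducing the notation

$$R_m = \sum_{k=m}^{\infty} \alpha_k \int_{-1}^{1} x^{2k}\, \overline{P}_n(x)\, dx + i \sum_{k=m}^{\infty} \beta_k \int_{-1}^{1} x^{2k+1}\, \overline{P}_n(x)\, dx, \qquad (47)$$

we immediately observe that, due to Lemma 2.2 and the triangle inequality,

$$|R_m| \le \sum_{k=2m}^{\infty} \frac{|a|^k}{k!}\, \sqrt{\frac{2}{2k+1}} < \sum_{k=2m}^{\infty} \frac{|a|^k}{k!}. \qquad (48)$$

Since (46) implies that

$$\frac{|a|}{2m + k} < \frac{|a|}{2m} < \frac{1}{2e} < \frac{1}{2} \qquad (49)$$

for all integer $m, k > 0$, we rewrite (48) as

$$|R_m| < \frac{|a|^{2m}}{(2m)!} \left( 1 + \frac{1}{2} + \frac{1}{4} + \cdots \right) = 2\, \frac{|a|^{2m}}{(2m)!}, \qquad (50)$$

and obtain (43) immediately using Stirling's formula. Finally, we obtain (45) by choosing

$$m = \lfloor e \cdot |a| \rfloor + 1. \qquad (51)$$

□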
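Theorem 3.4 Let $\psi_m(x)$ be the $m$-th prolate spheroidal function with band limit $c$, let
$\overline{P}_k(x)$ be the $k$-th normalized Legendre polynomial (defined in (13)), and let $\lambda_m$ be the
eigenvalue which corresponds to $\psi_m(x)$ (as in Theorem 2.4). Then for all integer $m \ge 0$
and all real positive $c$, if

$$k \ge 2\, (\lfloor e \cdot c \rfloor + 1), \qquad (52)$$

then

$$\left| \int_{-1}^{1} \psi_m(x)\, \overline{P}_k(x)\, dx \right| < \frac{1}{|\lambda_m|} \left( \frac{1}{2} \right)^{k-1}. \qquad (53)$$

Moreover, given any $\varepsilon > 0$, if

$$k \ge 2\, (\lfloor e \cdot c \rfloor + 1) + \log_2\!\left( \frac{1}{\varepsilon} \right) + \log_2\!\left( \frac{1}{|\lambda_m|} \right), \qquad (54)$$

then

$$\left| \int_{-1}^{1} \psi_m(x)\, \overline{P}_k(x)\, dx \right| < \varepsilon. \qquad (55)$$

Proof. Obviously

$$\left| \int_{-1}^{1} \psi_m(x)\, \overline{P}_k(x)\, dx \right|
  = \left| \frac{1}{\lambda_m} \int_{-1}^{1} \psi_m(x) \left( \int_{-1}^{1} e^{icxt}\, \overline{P}_k(t)\, dt \right) dx \right|
  \le \frac{1}{|\lambda_m|} \int_{-1}^{1} |\psi_m(x)| \cdot \left| \int_{-1}^{1} e^{icxt}\, \overline{P}_k(t)\, dt \right| dx. \qquad (56)$$

Introducing the notation

$$a = cx, \qquad (57)$$

and remembering that

$$\int_{-1}^{1} |\psi_m(x)|\, dx = 1, \qquad (58)$$

we observe that the combination of (56), (57), (58), and Lemma 3.3 implies that

$$\left| \int_{-1}^{1} \psi_m(x)\, \overline{P}_k(x)\, dx \right|
  < \frac{1}{|\lambda_m|} \left( \frac{1}{2} \right)^{k-1} \int_{-1}^{1} |\psi_m(x)|\, dx
  = \frac{1}{|\lambda_m|} \left( \frac{1}{2} \right)^{k-1}. \qquad (59)$$

Substituting (54) into (53), we immediately see (55).

□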
4  Numerical Evaluation of Prolate Spheroidal Wavefunctions


Both the classical Bouwkamp algorithm (see, for example, [1]) for the evaluation of the
functions $\psi_j$, and the algorithm presented in this paper for the same task, are based on
the expression of those functions as a Legendre series of the form

$$\psi_j(x) = \sum_{k=0}^{\infty} \alpha_k\, P_k(x); \qquad (60)$$

since the functions $\psi_j$ are smooth, the coefficients $\alpha_k$ decay superalgebraically (with
bounds for that decay being given in Theorem 3.4). Substituting (60) into (24), and
using (8) and (11), we obtain the well-known three-term recursion

$$\frac{(k+2)(k+1)}{(2k+3)(2k+5)} \cdot c^2 \cdot \alpha_{k+2}
  + \left( k(k+1) + \frac{2k(k+1) - 1}{(2k+3)(2k-1)} \cdot c^2 - \chi_j \right) \cdot \alpha_k
  + \frac{k(k-1)}{(2k-3)(2k-1)} \cdot c^2 \cdot \alpha_{k-2} = 0. \qquad (61)$$

Combining (61) with (13), we obtain the three-term recursion

$$\frac{(k+2)(k+1)}{(2k+3)\sqrt{(2k+5)(2k+1)}} \cdot c^2 \cdot \beta^j_{k+2}
  + \left( k(k+1) + \frac{2k(k+1) - 1}{(2k+3)(2k-1)} \cdot c^2 - \chi_j \right) \cdot \beta^j_k
  + \frac{k(k-1)}{(2k-1)\sqrt{(2k-3)(2k+1)}} \cdot c^2 \cdot \beta^j_{k-2} = 0 \qquad (62)$$

for the coefficients $\beta^j_0, \beta^j_1, \ldots$ of the expansion

$$\psi_j(x) = \sum_{k=0}^{\infty} \beta^j_k\, \overline{P}_k(x); \qquad (63)$$

for each $j = 0, 1, 2, \ldots$, we will denote by $\beta^j$ the vector in $l^2$ defined by the formula

$$\beta^j = (\beta^j_0, \beta^j_1, \beta^j_2, \ldots). \qquad (64)$$
The  following  theorem  restates  the  recursion  (62)  in  a  slightly  different  form. 
Theorem 4.1 The coefficients $\chi_i$ are the eigenvalues and the vectors $\beta^i$ are the corresponding
eigenvectors of the operator $l^2 \to l^2$ represented by the symmetric matrix $A$
given by the formulae

$$A_{k,k} = k(k+1) + \frac{2k(k+1) - 1}{(2k+3)(2k-1)} \cdot c^2, \qquad (65)$$

$$A_{k,k+2} = \frac{(k+2)(k+1)}{(2k+3)\sqrt{(2k+1)(2k+5)}} \cdot c^2, \qquad (66)$$

$$A_{k+2,k} = \frac{(k+2)(k+1)}{(2k+3)\sqrt{(2k+1)(2k+5)}} \cdot c^2, \qquad (67)$$

for all $k = 0, 1, 2, \ldots$, with the remainder of the entries of the matrix being zero.

In other words, the recursion (62) can be rewritten in the form

$$(A - \chi_j \cdot I)\, (\beta^j) = 0, \qquad (68)$$

where $A$ is separable into two symmetric tridiagonal matrices $A^{even}$ and $A^{odd}$, the first
consisting of the elements of $A$ with even-numbered rows and columns and the second
consisting of the elements of $A$ with odd-numbered rows and columns. While these two
matrices are infinite, and their entries do not decay much with increasing row or column
number, the eigenvectors $\{\beta^j\}$ of interest (those corresponding to the first $m$ prolate
spheroidal functions) lie almost entirely in the leading rows and columns of the matrices
(as shown by Theorem 3.4). Thus the evaluation of prolate spheroidal functions can be
performed by the following procedure:

• 1. Generate the leading $k$ rows and columns of $A$, where $k$ is given by (54).

• 2. Split the generated portion of $A$ into $A^{even}$ and $A^{odd}$, and use a solver for the
symmetric tridiagonal eigenproblem (such as that in LAPACK) to compute their
eigenvectors $\{\beta^j\}$ and eigenvalues $\{\chi_j\}$.

• 3. Use the obtained values of the coefficients $\beta^j_0, \beta^j_1, \ldots$ in the expansion (63) to
evaluate the function $\psi_j$ at arbitrary points on the interval $[-1, 1]$.

Obviously steps 1 and 2 can be performed as a precomputation, for any given value of
$c$. As a numerical diagonalization of a positive definite tridiagonal matrix with
well-separated eigenvalues, this precomputation stage is numerically robust and efficient,
requiring $O(cm)$ operations to construct the Legendre expansions of the form (64) for the
first $m$ prolate spheroidal functions; each subsequent evaluation of a prolate spheroidal
function takes $O(c)$ operations.
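
The three steps above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions rather than the authors' implementation: the names `prolate_matrix`, `prolate_expansions`, and `evaluate_prolate` are hypothetical, the truncation size is chosen by hand instead of via (54), and a dense symmetric eigensolver (`numpy.linalg.eigh`) stands in for a dedicated tridiagonal LAPACK routine, which costs more than the $O(cm)$ quoted above.

```python
import numpy as np
from numpy.polynomial import legendre

def prolate_matrix(c, size):
    """Leading size-by-size block of the symmetric matrix A of Theorem 4.1."""
    k = np.arange(size, dtype=float)
    diagonal = k * (k + 1) + (2 * k * (k + 1) - 1) / ((2 * k + 3) * (2 * k - 1)) * c**2
    coupling = (k + 2) * (k + 1) / ((2 * k + 3) * np.sqrt((2 * k + 1) * (2 * k + 5))) * c**2
    matrix = np.diag(diagonal)
    for i in range(size - 2):
        matrix[i, i + 2] = matrix[i + 2, i] = coupling[i]
    return matrix

def prolate_expansions(c, size):
    """Return chi[j] and beta[j, k], the normalized Legendre coefficients of psi_j.

    Only the leading functions are accurate; the last ones suffer from truncation.
    """
    matrix = prolate_matrix(c, size)
    chi = np.empty(size)
    beta = np.zeros((size, size))
    for parity in (0, 1):                       # even block, then odd block
        index = np.arange(parity, size, 2)
        values, vectors = np.linalg.eigh(matrix[np.ix_(index, index)])
        for i in range(len(index)):             # eigenvalues come out ascending
            chi[2 * i + parity] = values[i]
            beta[2 * i + parity, index] = vectors[:, i]
    return chi, beta

def evaluate_prolate(beta_j, x):
    """Evaluate psi_j(x) from its normalized Legendre coefficients."""
    scale = np.sqrt(np.arange(len(beta_j)) + 0.5)   # normalized -> plain Legendre
    return legendre.legval(x, beta_j * scale)

c = 10.0
chi, beta = prolate_expansions(c, 60)
print(chi[:4], evaluate_prolate(beta[0], 0.0))
```

The sign of each computed $\psi_j$ is whatever the eigensolver returns; fixing it (for example by requiring $\psi_j(1) > 0$) is left to the caller.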
4.1  Numerical Evaluation of Eigenvalues

Although the above algorithm for the evaluation of prolate spheroidal wave functions also
produces the eigenvalues $\{\chi_j\}$ of the differential operator (24), it does not produce the
eigenvalues $\{\lambda_j\}$ of the integral operator $F_c$ (defined in (17)). Some of those eigenvalues
can be computed using the formula

$$\lambda_j\, \psi_j(x) = \int_{-1}^{1} e^{icxt}\, \psi_j(t)\, dt, \qquad (69)$$

evaluating the integral on the right hand side numerically; however, that evaluation
obviously has a condition number of about $1/\lambda_j$, and is thus inappropriate for computing
small $\lambda_j$. A well-conditioned procedure is as follows:

• 1. Use (69) to calculate $\lambda_0$, evaluating the right hand side numerically, and with
$x = 0$ (so that $\psi_0(x)$ is not small).

• 2. Use the calculated $\lambda_0$, together with Corollary 3.2, to compute the absolute values
$|\lambda_j|$, for $j = 1, 2, \ldots, m$, computing each $|\lambda_j|$ from $|\lambda_{j-1}|$ (and again, evaluating
the required integrals numerically).

• 3. Use the fact that $\lambda_j = i^j\, |\lambda_j|$ (see Theorem 2.4) to finish the computation.
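
A possible realization of this three-step procedure is sketched below. It is an illustration only: the helper names `legendre_coefficients` and `prolate_eigenvalues`, the truncation size, and the use of Gauss-Legendre quadrature for the required integrals are assumptions of this example, and the prolate functions are obtained from the truncated matrix of Theorem 4.1 with a dense eigensolver.

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_coefficients(c, size):
    """Row j holds the plain-Legendre coefficients of psi_j (truncated matrix A)."""
    k = np.arange(size, dtype=float)
    matrix = np.diag(k * (k + 1) + (2 * k * (k + 1) - 1) / ((2 * k + 3) * (2 * k - 1)) * c**2)
    off = (k + 2) * (k + 1) / ((2 * k + 3) * np.sqrt((2 * k + 1) * (2 * k + 5))) * c**2
    for i in range(size - 2):
        matrix[i, i + 2] = matrix[i + 2, i] = off[i]
    coefficients = np.zeros((size, size))
    for parity in (0, 1):
        index = np.arange(parity, size, 2)
        _, vectors = np.linalg.eigh(matrix[np.ix_(index, index)])
        for i in range(len(index)):
            # Convert normalized-Legendre coefficients to plain-Legendre ones.
            coefficients[2 * i + parity, index] = vectors[:, i] * np.sqrt(index + 0.5)
    return coefficients

def prolate_eigenvalues(c, m, size=80, n_quad=120):
    """Eigenvalues lambda_0..lambda_m of F_c via (69) at x = 0 and Corollary 3.2."""
    coefficients = legendre_coefficients(c, size)
    x, w = legendre.leggauss(n_quad)
    psi = [legendre.legval(x, coefficients[j]) for j in range(m + 1)]
    dpsi = [legendre.legval(x, legendre.legder(coefficients[j])) for j in range(m + 1)]
    # Step 1: at x = 0 the exponential equals 1, so lambda_0 = int psi_0 / psi_0(0).
    magnitude = [abs(np.sum(w * psi[0]) / legendre.legval(0.0, coefficients[0]))]
    # Step 2: |lambda_j| from |lambda_{j-1}| using the ratio of Corollary 3.2.
    for j in range(1, m + 1):
        numerator = np.sum(w * dpsi[j - 1] * psi[j])
        denominator = np.sum(w * dpsi[j] * psi[j - 1])
        magnitude.append(magnitude[-1] * np.sqrt(abs(numerator / denominator)))
    # Step 3: lambda_j = i^j |lambda_j|.
    return np.array([(1j) ** j * magnitude[j] for j in range(m + 1)])

c = 10.0
lam = prolate_eigenvalues(c, 12)
print(np.abs(lam))
```

Because only ratios of well-scaled integrals are used in step 2, the small eigenvalues are obtained without dividing by a small number.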


5  Quadratures  for  Band-Limited  Functions 

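Since the prolate spheroidal wave functions $\psi_0, \psi_1, \ldots, \psi_n, \ldots$ constitute a complete
orthonormal basis in $L^2[-1, 1]$ (see Theorem 2.4),

$$e^{icxt} = \sum_{j=0}^{\infty} \left( \int_{-1}^{1} e^{icx\tau}\, \psi_j(\tau)\, d\tau \right) \psi_j(t), \qquad (70)$$

for all $x, t \in [-1, 1]$; substituting (18) into (70) yields

$$e^{icxt} = \sum_{j=0}^{\infty} \lambda_j\, \psi_j(x)\, \psi_j(t). \qquad (71)$$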

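Thus if a quadrature with nodes $x_1, \ldots, x_m$ and weights $w_1, \ldots, w_m$ integrates exactly
the first $n$ eigenfunctions, that is, if

$$\sum_{k=1}^{m} w_k\, \psi_j(x_k) = \int_{-1}^{1} \psi_j(x)\, dx, \qquad (72)$$

for all $j = 0, 1, \ldots, n-1$, then the error of the quadrature when applied to a function
$f(x) = e^{icax}$, with $a \in [-1, 1]$, is given by

$$\begin{aligned}
\left| \sum_{k=1}^{m} w_k\, e^{icax_k} - \int_{-1}^{1} e^{icax}\, dx \right|
  &= \left| \sum_{k=1}^{m} w_k \left( \sum_{j=0}^{\infty} \lambda_j\, \psi_j(a)\, \psi_j(x_k) \right) - \int_{-1}^{1} \left( \sum_{j=0}^{\infty} \lambda_j\, \psi_j(a)\, \psi_j(x) \right) dx \right| \\
  &= \left| \sum_{k=1}^{m} w_k \left( \sum_{j=n}^{\infty} \lambda_j\, \psi_j(a)\, \psi_j(x_k) \right) - \int_{-1}^{1} \sum_{j=n}^{\infty} \lambda_j\, \psi_j(a)\, \psi_j(x)\, dx \right|. \qquad (73)
\end{aligned}$$

Due to the orthonormality of the functions $\{\psi_j\}$,

$$\left\| \sum_{j=n}^{\infty} \lambda_j\, \psi_j(a)\, \psi_j(x) \right\|^2 = \sum_{j=n}^{\infty} |\lambda_j|^2. \qquad (74)$$

From (74), it is obvious that the error of integration (73) is of roughly the same magnitude
as $\lambda_n$, provided that $n$ is in the range where the eigenvalues $\{\lambda_j\}$ are decreasing
exponentially (as is the case for quadratures of any useful accuracy; see Theorem 2.5)
and provided in addition that the weights are not large.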

Now, the existence of an $n/2$-point quadrature that is exact for the first $n$ Prolate
Spheroidal Wave functions follows from the combination of Theorems 2.1, 2.4; an algorithm
for the numerical evaluation of nodes and weights of such quadratures can be
found in [2]. An alternative procedure for the construction of quadrature formulae for
band-limited functions (leading to slightly different nodes and weights) is described in
the following section; a numerical comparison of the two can be found in Section 8 below.


Remark 5.1 The above text considers only the error of integration of a single exponential.
For a band-limited function $g : [-1, 1] \to \mathbb{C}$ given by the formula

$$g(x) = \int_{-1}^{1} G(t)\, e^{icxt}\, dt \qquad (75)$$

for some function $G : [-1, 1] \to \mathbb{C}$, the error is obviously bounded by the formula

$$\left| \sum_{k=1}^{m} w_k\, g(x_k) - \int_{-1}^{1} g(x)\, dx \right| \le \varepsilon\, \|G\|, \qquad (76)$$

where $\varepsilon$ is the maximum error of integration (73) of a single exponential, for any $t \in [-1, 1]$.
While $\|G\|$ might be much larger than $\|g\|_{[-1,1]}$ (as it is if, for instance, $g = \psi_j$ with a
large index $j$), if the same equation (75) is used to extend $g$ to the rest of the real line, then
by Parseval's formula $\|G\| = \|g\|_{(-\infty,\infty)}$; that is to say, although the error of such a
quadrature when applied to a band-limited function is not bounded proportional to the norm
of that function on the interval of integration, it is bounded proportional to the norm of that
function on the entire real line.
6  Quadrature Nodes from Roots of Prolate Functions

An alternative to the approach of the previous section is to use roots of appropriate
prolate spheroidal wave functions as quadrature nodes, with the weights determined via
the procedure described in Remark 2.1. The following theorems provide a basis for this;
numerically (see Section 8) the resulting quadrature nodes tend to be inferior to those
produced by the optimization scheme of [13, 25, 2]; however, they are useful as starting
points for that scheme, or as somewhat less efficient nodes which can be computed much
more quickly.
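
The construction just described (nodes at the roots of $\psi_n$, weights from the linear system of Remark 2.1) can be sketched as follows. Everything specific here is an assumption of this illustration: the function names, the truncation size, the grid-plus-bisection root finder, and the choice $c = 10$, $n = 16$. The prolate functions come from the truncated matrix of Theorem 4.1.

```python
import numpy as np
from numpy.polynomial import legendre

def prolate_coefficients(c, size):
    """Row j holds the plain-Legendre coefficients of psi_j (truncated matrix A)."""
    k = np.arange(size, dtype=float)
    matrix = np.diag(k * (k + 1) + (2 * k * (k + 1) - 1) / ((2 * k + 3) * (2 * k - 1)) * c**2)
    off = (k + 2) * (k + 1) / ((2 * k + 3) * np.sqrt((2 * k + 1) * (2 * k + 5))) * c**2
    for i in range(size - 2):
        matrix[i, i + 2] = matrix[i + 2, i] = off[i]
    coefficients = np.zeros((size, size))
    for parity in (0, 1):
        index = np.arange(parity, size, 2)
        _, vectors = np.linalg.eigh(matrix[np.ix_(index, index)])
        for i in range(len(index)):
            coefficients[2 * i + parity, index] = vectors[:, i] * np.sqrt(index + 0.5)
    return coefficients

def prolate_root_quadrature(c, n, size=80):
    """Nodes: roots of psi_n. Weights: exact integration of psi_0..psi_{n-1}."""
    coefficients = prolate_coefficients(c, size)

    def psi(j, x):
        return legendre.legval(x, coefficients[j])

    # Bracket the roots of psi_n by sign changes on a fine grid, then bisect.
    grid = np.linspace(-1.0, 1.0, 4000)
    values = psi(n, grid)
    change = values[:-1] * values[1:] < 0
    left, right = grid[:-1][change], grid[1:][change]
    for _ in range(60):
        middle = 0.5 * (left + right)
        same_side = psi(n, left) * psi(n, middle) > 0
        left = np.where(same_side, middle, left)
        right = np.where(same_side, right, middle)
    nodes = 0.5 * (left + right)
    # Weights: solve sum_k w_k psi_j(x_k) = int psi_j for j = 0..n-1 (Remark 2.1).
    xq, wq = legendre.leggauss(2 * size)
    matrix = np.array([psi(j, nodes) for j in range(n)])
    moments = np.array([np.sum(wq * psi(j, xq)) for j in range(n)])
    return nodes, np.linalg.solve(matrix, moments)

c = 10.0
nodes, weights = prolate_root_quadrature(c, 16)
print(nodes, weights)
```

Applying the resulting rule to $e^{icax}$ with $|a| \le 1$ and comparing with the exact value $2 \sin(ca)/(ca)$ gives a direct check of the accuracy for a given $c$ and $n$.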

6.1  Euclid  Division  Algorithm  for  Band-Limited  Functions 

The following two theorems constitute a straightforward extension to band-limited functions
of Euclid's division algorithm for polynomials. Their proofs are quite simple, and
are provided here for completeness, since the author failed to find them in the literature.

Theorem 6.1 Suppose that $\sigma, \varphi : [0, 1] \to \mathbb{C}$ are a pair of $c^2$-functions such that

$$\varphi(1) \ne 0, \qquad (77)$$

$c$ is a positive real number, and the functions $f$, $p$ are defined by the formulae

$$f(x) = \int_0^1 \sigma(t)\, e^{2icxt}\, dt, \qquad (78)$$

$$p(x) = \int_0^1 \varphi(t)\, e^{icxt}\, dt. \qquad (79)$$

Then there exist two $c^1$-functions $\eta, \xi : [0, 1] \to \mathbb{C}$ such that

$$f(x) = p(x)\, q(x) + r(x) \qquad (80)$$

for all $x \in \mathbb{R}$, with the functions $q, r : [0, 1] \to \mathbb{R}$ defined by the formulae

$$q(x) = \int_0^1 \eta(t)\, e^{icxt}\, dt, \qquad (81)$$

$$r(x) = \int_0^1 \xi(t)\, e^{icxt}\, dt. \qquad (82)$$

Figure 1: The split of integration range that yields (85)


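Proof.

Obviously, for any functions $p, q$ given by (79), (81),

$$p(x)\,q(x) = \int_0^1 \varphi(t)\, e^{icxt}\, dt \cdot \int_0^1 \eta(\tau)\, e^{icx\tau}\, d\tau = \int_0^1 \int_0^1 \varphi(t)\, \eta(\tau)\, e^{icx(t+\tau)}\, d\tau\, dt. \tag{83}$$

Defining the new independent variable $u$ by the formula

$$u = t + \tau, \tag{84}$$

we rewrite (83) as

$$p(x)\,q(x) = \int_0^1 e^{icux} \int_0^u \varphi(u-\tau)\, \eta(\tau)\, d\tau\, du + \int_1^2 e^{icux} \int_{u-1}^1 \varphi(u-\tau)\, \eta(\tau)\, d\tau\, du \tag{85}$$

(see Figure 1). Substituting (78), (82), and (85) into (80), we get

$$\int_0^1 e^{icux} \int_0^u \varphi(u-\tau)\, \eta(\tau)\, d\tau\, du + \int_1^2 e^{icux} \int_{u-1}^1 \varphi(u-\tau)\, \eta(\tau)\, d\tau\, du + \int_0^1 \xi(t)\, e^{icxt}\, dt = \int_0^{1/2} \sigma(t)\, e^{2icxt}\, dt + \int_{1/2}^1 \sigma(t)\, e^{2icxt}\, dt. \tag{86}$$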

Due to the well-known uniqueness of the Fourier Transform, (86) is equivalent to two independent equations:

$$\int_0^1 e^{icux} \int_0^u \varphi(u-\tau)\, \eta(\tau)\, d\tau\, du + \int_0^1 \xi(t)\, e^{icxt}\, dt = \int_0^{1/2} \sigma(t)\, e^{2icxt}\, dt, \tag{87}$$

$$\int_1^2 e^{icux} \int_{u-1}^1 \varphi(u-\tau)\, \eta(\tau)\, d\tau\, du = \int_{1/2}^1 \sigma(t)\, e^{2icxt}\, dt. \tag{88}$$

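Now, we observe that (88) does not contain $\xi$, and use it to obtain an expression for $\eta$ as a function of $\varphi$, $\sigma$. After that, we will view (87) as an expression for $\xi$ via $\varphi$, $\sigma$, $\eta$. From (88) and the uniqueness of the Fourier Transform (after the substitution $t = u/2$ on the right-hand side of (88)), we obtain

$$\int_{u-1}^{1} \varphi(u-\tau)\, \eta(\tau)\, d\tau = \frac{1}{2}\, \sigma\!\left(\frac{u}{2}\right) \tag{89}$$

for all $u \in [1,2]$. Introducing the new variable $v$ via the formula

$$v = u - 1, \tag{90}$$

we convert (89) into

$$\int_{v}^{1} \varphi(v+1-\tau)\, \eta(\tau)\, d\tau = \frac{1}{2}\, \sigma\!\left(\frac{v+1}{2}\right), \tag{91}$$

which is a Volterra equation of the first kind with respect to $\eta$; differentiating (91) with respect to $v$, we get

$$-\varphi(1)\, \eta(v) + \int_{v}^{1} \varphi'(v+1-\tau)\, \eta(\tau)\, d\tau = \frac{1}{4}\, \sigma'\!\left(\frac{v+1}{2}\right), \tag{92}$$

which, since $\varphi(1) \neq 0$, is a Volterra equation of the second kind. Now, the existence and uniqueness of the solution of (92) (and, therefore, of (89) and (88)) follows from Theorem 2.3 of Section 2.

With $\eta$ defined as the solution of (89), we use (87) together with the uniqueness of the Fourier Transform, to finally obtain

$$\xi(u) = \frac{1}{2}\, \sigma\!\left(\frac{u}{2}\right) - \int_0^u \varphi(u-\tau)\, \eta(\tau)\, d\tau \tag{93}$$

for all $u \in [0,1]$.

□

The following theorem is a consequence of the preceding one.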


Theorem 6.2 Suppose that $\sigma, \varphi : [-1,1] \to \mathbb{C}$ are a pair of $C^2$-functions such that $\varphi(-1) \neq 0$, $\varphi(1) \neq 0$, $c$ is a positive real number, and the functions $f, p$ are defined by the formulae

$$f(x) = \int_{-1}^1 \sigma(t)\, e^{2icxt}\, dt, \tag{94}$$

$$p(x) = \int_{-1}^1 \varphi(t)\, e^{icxt}\, dt. \tag{95}$$

Then there exist two $C^1$-functions $\eta, \xi : [-1,1] \to \mathbb{C}$ such that

$$f(x) = p(x)\, q(x) + r(x) \tag{96}$$

for all $x \in \mathbb{R}$, with the functions $q, r : \mathbb{R} \to \mathbb{C}$ defined by the formulae

$$q(x) = \int_{-1}^1 \eta(t)\, e^{icxt}\, dt, \tag{97}$$

$$r(x) = \int_{-1}^1 \xi(t)\, e^{icxt}\, dt. \tag{98}$$

Proof.

Defining the functions $f_+, f_-, p_+, p_-$ by the formulae

$$f_+(x) = \int_0^1 \sigma(t)\, e^{2icxt}\, dt, \tag{99}$$

$$f_-(x) = \int_{-1}^0 \sigma(t)\, e^{2icxt}\, dt, \tag{100}$$

$$p_+(x) = \int_0^1 \varphi(t)\, e^{icxt}\, dt, \tag{101}$$

$$p_-(x) = \int_{-1}^0 \varphi(t)\, e^{icxt}\, dt, \tag{102}$$

we observe that for all $x \in \mathbb{R}$,

$$f(x) = f_+(x) + f_-(x), \tag{103}$$

$$p(x) = p_+(x) + p_-(x). \tag{104}$$

Due to Theorem 6.1, there exist such $\eta_+, \eta_-, \xi_+, \xi_-$ that

$$f_+(x) = p_+(x)\, q_+(x) + r_+(x), \tag{105}$$

$$f_-(x) = p_-(x)\, q_-(x) + r_-(x), \tag{106}$$

with the functions $q_+, q_-, r_+, r_-$ defined by the formulae

$$q_+(x) = \int_0^1 \eta_+(t)\, e^{icxt}\, dt, \tag{107}$$

$$q_-(x) = \int_{-1}^0 \eta_-(t)\, e^{icxt}\, dt, \tag{108}$$

$$r_+(x) = \int_0^1 \xi_+(t)\, e^{icxt}\, dt, \tag{109}$$

$$r_-(x) = \int_{-1}^0 \xi_-(t)\, e^{icxt}\, dt. \tag{110}$$

Now, defining $q$ by the formula

$$q(x) = q_-(x) + q_+(x) \tag{111}$$

for all $x \in \mathbb{R}$, we have

$$p(x)\, q(x) = (p_-(x) + p_+(x)) \cdot (q_-(x) + q_+(x)) = p_+(x)\, q_+(x) + p_-(x)\, q_-(x) + p_-(x)\, q_+(x) + p_+(x)\, q_-(x), \tag{112}$$

and we define $r(x)$ by the obvious formula

$$r(x) = r_-(x) + r_+(x) - (p_-(x)\, q_+(x) + p_+(x)\, q_-(x)). \tag{113}$$

□


6.2  Quadrature  nodes  from  the  division  theorem 

In  much  the  same  way  that  the  division  theorem  for  polynomials  can  be  used  to  provide 
a  constructive  proof  of  Gaussian  quadratures,  Theorem  6.2  provides  a  method  of  con¬ 
structing  generalized  Gaussian  quadratures  for  band-limited  functions.  The  method  is 
as  follows. 
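
The polynomial analogue mentioned above can be sketched in a few lines. The following is only an illustration under stated assumptions, not the prolate construction itself: the nodes are taken to be the roots of the Legendre polynomial $P_3$ (standing in for the roots of $\psi_n$), the weights are obtained by solving the small linear system that enforces exactness on $1, x, x^2$, and the helper names `solve`, `monomial_integral` and `quad_error` are hypothetical.

```python
import math

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def monomial_integral(k):
    """Exact integral of x**k over [-1, 1]."""
    return 0.0 if k % 2 else 2.0 / (k + 1)

# Nodes: the three roots of the Legendre polynomial P_3(x) = (5x^3 - 3x)/2.
nodes = [-math.sqrt(0.6), 0.0, math.sqrt(0.6)]
n = len(nodes)

# Weights: require exactness on the first n basis functions 1, x, x^2.
matrix = [[x ** j for x in nodes] for j in range(n)]
rhs = [monomial_integral(j) for j in range(n)]
weights = solve(matrix, rhs)

def quad_error(k):
    """Absolute error of the rule applied to x**k."""
    approx = sum(w * x ** k for w, x in zip(weights, nodes))
    return abs(approx - monomial_integral(k))
```

Although the weights are fitted to only three functions, the division argument makes the rule exact for all polynomials of degree up to $2n - 1 = 5$; `quad_error(6)` is the first error that is not at round-off level.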

To construct a quadrature for functions of a bandwidth $2c$, prolate spheroidal wave functions corresponding to bandwidth $c$ are used. (Thus the eigenvalues $\{\lambda_j\}$ and eigenfunctions $\{\psi_j\}$ in this section are, as elsewhere in the paper, those corresponding to bandwidth $c$.) The following theorem provides a bound of the error of a quadrature whose nodes are the roots of the $n$-th prolate function $\psi_n$, when applied to a function $f$ which satisfies the conditions of the division theorem, in terms of the norms of the quotient and remainder of $f$ divided by $\psi_n$:

Theorem 6.3 Suppose that $x_1, x_2, \ldots, x_n \in \mathbb{R}$ are the roots of $\psi_n$. Let the numbers $w_1, w_2, \ldots, w_n \in \mathbb{R}$ be such that

$$\sum_{k=1}^n w_k\, \psi_j(x_k) = \int_{-1}^1 \psi_j(x)\, dx, \tag{114}$$

for all $j = 0, 1, \ldots, n-1$. Then for any function $f : [-1,1] \to \mathbb{C}$ which satisfies the conditions of Theorem 6.2,

$$\left| \sum_{k=1}^n w_k f(x_k) - \int_{-1}^1 f(x)\, dx \right| \le |\lambda_n| \cdot \|\eta\| + \|\xi\| \cdot \sum_{j=n}^{\infty} |\lambda_j| \cdot \|\psi_j\|_\infty^2 \cdot \left( 2 + \sum_{k=1}^n |w_k| \right), \tag{115}$$

where the functions $\eta, \xi : [-1,1] \to \mathbb{C}$ are as defined in Theorem 6.2.

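Proof. Since $f$ satisfies the conditions of Theorem 6.2, there exist functions $q, r$ defined by (97), (98) such that

$$f(x) = \psi_n(x)\, q(x) + r(x). \tag{116}$$

Then, defining the error of integration $E_f$ for the function $f$ by

$$E_f = \left| \sum_{k=1}^n w_k f(x_k) - \int_{-1}^1 f(x)\, dx \right|, \tag{117}$$

we have

$$E_f = \left| \sum_{k=1}^n w_k \left( \psi_n(x_k)\, q(x_k) + r(x_k) \right) - \int_{-1}^1 \left( \psi_n(x)\, q(x) + r(x) \right) dx \right| \le \left| \sum_{k=1}^n w_k\, \psi_n(x_k)\, q(x_k) - \int_{-1}^1 \psi_n(x)\, q(x)\, dx \right| + \left| \sum_{k=1}^n w_k\, r(x_k) - \int_{-1}^1 r(x)\, dx \right|. \tag{118}$$

Since the nodes $\{x_k\}$ are the roots of $\psi_n$,

$$\sum_{k=1}^n w_k\, \psi_n(x_k)\, q(x_k) = 0. \tag{119}$$

Thus

$$E_f \le \left| \int_{-1}^1 \psi_n(x)\, q(x)\, dx \right| + \left| \sum_{k=1}^n w_k\, r(x_k) - \int_{-1}^1 r(x)\, dx \right|. \tag{120}$$

Now

$$\int_{-1}^1 \psi_n(x)\, q(x)\, dx = \int_{-1}^1 \psi_n(x) \int_{-1}^1 \eta(t)\, e^{icxt}\, dt\, dx = \int_{-1}^1 \eta(t) \int_{-1}^1 \psi_n(x)\, e^{icxt}\, dx\, dt = \int_{-1}^1 \eta(t)\, \lambda_n\, \psi_n(t)\, dt. \tag{121}$$

Using the Cauchy-Schwarz inequality and the fact that the function $\psi_n$ has unit norm, we get from (121) that

$$\left| \int_{-1}^1 \psi_n(x)\, q(x)\, dx \right| \le |\lambda_n| \cdot \|\eta\|. \tag{122}$$

Also,

$$\sum_{k=1}^n w_k\, r(x_k) - \int_{-1}^1 r(x)\, dx = \sum_{k=1}^n w_k \left( \int_{-1}^1 \xi(t)\, e^{icx_k t}\, dt \right) - \int_{-1}^1 \left( \int_{-1}^1 \xi(t)\, e^{icxt}\, dt \right) dx = \int_{-1}^1 \xi(t) \left( \sum_{k=1}^n w_k\, e^{icx_k t} - \int_{-1}^1 e^{icxt}\, dx \right) dt. \tag{123}$$

Substituting (73) into (123), and using the Cauchy-Schwarz inequality, we get

$$\left| \sum_{k=1}^n w_k\, r(x_k) - \int_{-1}^1 r(x)\, dx \right| = \left| \int_{-1}^1 \xi(t) \left( \sum_{k=1}^n w_k \sum_{j=n}^{\infty} \lambda_j\, \psi_j(x_k)\, \psi_j(t) - \int_{-1}^1 \sum_{j=n}^{\infty} \lambda_j\, \psi_j(x)\, \psi_j(t)\, dx \right) dt \right| \le \|\xi\| \cdot \sum_{j=n}^{\infty} |\lambda_j| \cdot \|\psi_j\|_\infty^2 \cdot \left( 2 + \sum_{k=1}^n |w_k| \right). \tag{124}$$

Combining (120), (122), and (124), we get

$$E_f \le |\lambda_n| \cdot \|\eta\| + \|\xi\| \cdot \sum_{j=n}^{\infty} |\lambda_j| \cdot \|\psi_j\|_\infty^2 \cdot \left( 2 + \sum_{k=1}^n |w_k| \right). \tag{125}$$

□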


Remark 6.1 The use of Theorem 6.3 for the construction of quadrature rules for band-limited functions depends on the fact that the norms of the band-limited functions $q$ and $r$ in (116) are not large, compared to the norm of $f$ (both sets of norms being on $(-\infty, \infty)$). Such estimates have been obtained for all $n > 2c/\pi + 10 \log(c)$. The proofs are quite involved, and will be reported at a later date. In this paper, we demonstrate the performance of the obtained quadrature formulae numerically (see Section 8 below).

Remark 6.2 It is natural to view (116) as an analogue for band-limited functions of the Euclid division theorem for polynomials. However, there are certain differences. In particular, Theorem 6.1 admits extensions to band-limited functions of several variables, while the classical Euclid algorithm does not. Such extensions (together with several applications) will be reported at a later date.


7  Interpolation  via  Prolate  Spheroidal  Wavefunctions 

Interpolation is usually performed by the following general procedure: assuming that the function $f : [a,b] \to \mathbb{C}$ to be interpolated is given by the formula

$$f(x) = c_1 \phi_1(x) + c_2 \phi_2(x) + \ldots + c_n \phi_n(x), \tag{126}$$

where $\phi_1, \phi_2, \ldots, \phi_n : [a,b] \to \mathbb{C}$ are a fixed sequence of functions (often polynomials), solve an $n \times n$ linear system to determine the coefficients $c_1, c_2, \ldots, c_n$ from the values of $f$ at the $n$ interpolation nodes, then use (126) to evaluate $f$ wherever needed. As is well known, if $f$ is well-approximated by a linear combination of the interpolation functions, and if the linear system to be solved is well-conditioned, then this procedure is accurate.
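
The general procedure can be sketched as follows. This is a minimal illustration under stated assumptions rather than the scheme of this paper: Chebyshev polynomials play the role of the interpolation functions $\phi_j$, Chebyshev points play the role of the nodes, and the names `solve`, `basis` and `interpolate` are hypothetical helpers.

```python
import math

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def basis(j, x):
    """Chebyshev polynomial T_j(x) on [-1, 1]; an assumed stand-in for phi_j."""
    return math.cos(j * math.acos(x))

def interpolate(f, n):
    """Return nodes, coefficients c_j and an evaluator for the interpolant (126)."""
    nodes = [math.cos((2 * k + 1) * math.pi / (2 * n)) for k in range(n)]
    matrix = [[basis(j, x) for j in range(n)] for x in nodes]
    coeffs = solve(matrix, [f(x) for x in nodes])

    def evaluate(x):
        return sum(c * basis(j, x) for j, c in enumerate(coeffs))

    return nodes, coeffs, evaluate

def f(x):
    return math.cos(2.0 * x)

nodes, coeffs, p = interpolate(f, 8)
grid = [-1.0 + 2.0 * i / 200 for i in range(201)]
max_error = max(abs(p(x) - f(x)) for x in grid)
```

With eight nodes the interpolant of $\cos(2x)$ is accurate to roughly four digits on $[-1,1]$; the same three steps (build the matrix, solve for the coefficients, evaluate the sum) apply when the basis is replaced by prolate functions.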

As shown in Section 5 in the context of quadratures, a linear combination of the first $n$ prolate spheroidal functions $\psi_0, \psi_1, \ldots, \psi_{n-1}$ for a band limit $c$ can provide a good approximation to functions of the form $e^{icxt}$, with $t \in [-1,1]$ (see (71), (74)); in the regime where the accuracy is numerically useful, the error is of the same order of magnitude as $|\lambda_n|$. This, in turn, shows that they provide a good approximation (in the same sense as in Remark 5.1) to any band-limited function of band limit $c$. Thus, if $\psi_0, \psi_1, \ldots, \psi_{n-1}$ are used as the interpolation functions in this procedure, they can be expected to yield an accurate interpolation scheme for band-limited functions, provided that the matrix to be inverted is well-conditioned. The following theorem shows that if the interpolation nodes are chosen to be quadrature nodes accurate up to twice the bandwidth of interpolation, with the quadrature formula being accurate to more than twice as many digits as the interpolation formula is to be accurate to, then the matrix inverted in the procedure is close to being a scaled version of an orthogonal matrix.
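
The mechanism behind this statement can be seen in its classical polynomial counterpart. The sketch below is an illustration under assumptions, not the prolate case: it uses the first three orthonormal Legendre polynomials in place of $\psi_0, \psi_1, \psi_2$ and the three-point Gauss-Legendre rule in place of a generalized Gaussian quadrature, and forms the matrix $I - A^T W A$ entry by entry.

```python
import math

# Three-point Gauss-Legendre rule on [-1, 1] (exact for polynomials of degree <= 5).
nodes = [-math.sqrt(0.6), 0.0, math.sqrt(0.6)]
weights = [5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0]

def normalized_legendre(j, x):
    """Legendre polynomials of degree 0..2, scaled to unit L2 norm on [-1, 1]."""
    if j == 0:
        return math.sqrt(0.5)
    if j == 1:
        return math.sqrt(1.5) * x
    if j == 2:
        return math.sqrt(2.5) * 0.5 * (3.0 * x * x - 1.0)
    raise ValueError("only degrees 0, 1, 2 are provided in this sketch")

n = len(nodes)
# A[l][j] = value of the j-th basis function at the l-th node.
A = [[normalized_legendre(j, x) for j in range(n)] for x in nodes]

# E = I - A^T W A, with W the diagonal matrix of quadrature weights.
E = [[(1.0 if j == k else 0.0)
      - sum(w * A[l][j] * A[l][k] for l, w in enumerate(weights))
      for k in range(n)]
     for j in range(n)]
max_entry = max(abs(e) for row in E for e in row)
```

Because the quadrature integrates every product of two basis functions exactly, $A^T W A$ reproduces the Gram matrix of an orthonormal family, so `max_entry` is at round-off level and $W^{1/2} A$ is orthogonal. For prolate functions the quadrature is only approximately exact, which is why the theorem below bounds the entries of $E$ instead of asserting that they vanish.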

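Theorem 7.1 Suppose the numbers $w_1, w_2, \ldots, w_n \in \mathbb{R}$ and $x_1, x_2, \ldots, x_n \in \mathbb{R}$ are such that

$$\left| \int_{-1}^1 e^{2icax}\, dx - \sum_{j=1}^n w_j\, e^{2icax_j} \right| < \varepsilon, \tag{127}$$

for all $a \in [-1,1]$, and for some $c > 0$. Let the matrix $A$ be given by the formula

$$A = \begin{pmatrix} \psi_0(x_1) & \psi_1(x_1) & \cdots & \psi_{n-1}(x_1) \\ \psi_0(x_2) & \psi_1(x_2) & \cdots & \psi_{n-1}(x_2) \\ \vdots & \vdots & & \vdots \\ \psi_0(x_n) & \psi_1(x_n) & \cdots & \psi_{n-1}(x_n) \end{pmatrix}, \tag{128}$$

let the matrix $W$ be the diagonal matrix whose diagonal entries are $w_1, w_2, \ldots, w_n$, and let the matrix $E = [e_{jk}]$ be given by the formula

$$E = I - A^* W A. \tag{129}$$

Then

$$|e_{jk}| < \frac{2\varepsilon}{|\lambda_{j-1}\, \lambda_{k-1}|}. \tag{130}$$

Proof. Clearly

$$e_{jk} = \delta_{jk} - \sum_{l=1}^n w_l\, \overline{\psi_{j-1}(x_l)}\, \psi_{k-1}(x_l), \tag{131}$$

where $\delta_{jk}$ is the Kronecker delta function. Using (18), this becomes

$$e_{jk} = \delta_{jk} - \sum_{l=1}^n w_l \left( \frac{1}{\overline{\lambda_{j-1}}} \int_{-1}^1 e^{-icx_l t}\, \psi_{j-1}(t)\, dt \right) \cdot \left( \frac{1}{\lambda_{k-1}} \int_{-1}^1 e^{icx_l \tau}\, \psi_{k-1}(\tau)\, d\tau \right) = \delta_{jk} - \frac{1}{\overline{\lambda_{j-1}}\, \lambda_{k-1}} \int_{-1}^1 \int_{-1}^1 \psi_{j-1}(t)\, \psi_{k-1}(\tau) \sum_{l=1}^n w_l\, e^{-icx_l t}\, e^{icx_l \tau}\, dt\, d\tau. \tag{132}$$

Using (127), this becomes

$$e_{jk} = \delta_{jk} - \frac{1}{\overline{\lambda_{j-1}}\, \lambda_{k-1}} \int_{-1}^1 \int_{-1}^1 \psi_{j-1}(t)\, \psi_{k-1}(\tau) \cdot \left( \int_{-1}^1 e^{-icst}\, e^{ics\tau}\, ds - f_\varepsilon(t+\tau) \right) dt\, d\tau, \tag{133}$$

where $f_\varepsilon : [-2,2] \to \mathbb{C}$ is a function which satisfies the relation

$$|f_\varepsilon(x)| < \varepsilon, \tag{134}$$

for all $x \in [-2,2]$. Thus

$$e_{jk} = \delta_{jk} - \frac{1}{\overline{\lambda_{j-1}}\, \lambda_{k-1}} \int_{-1}^1 \left( \int_{-1}^1 \psi_{j-1}(t)\, e^{-icst}\, dt \right) \left( \int_{-1}^1 \psi_{k-1}(\tau)\, e^{ics\tau}\, d\tau \right) ds + \frac{1}{\overline{\lambda_{j-1}}\, \lambda_{k-1}} \int_{-1}^1 \int_{-1}^1 \psi_{j-1}(t)\, \psi_{k-1}(\tau)\, f_\varepsilon(t+\tau)\, dt\, d\tau. \tag{135}$$

Using (18), this becomes

$$e_{jk} = \delta_{jk} - \int_{-1}^1 \psi_{j-1}(s)\, \psi_{k-1}(s)\, ds + \frac{1}{\overline{\lambda_{j-1}}\, \lambda_{k-1}} \int_{-1}^1 \psi_{k-1}(\tau) \int_{-1}^1 \psi_{j-1}(t)\, f_\varepsilon(t+\tau)\, dt\, d\tau. \tag{136}$$

Due to the orthonormality of the functions $\{\psi_j\}$, this becomes

$$e_{jk} = \frac{1}{\overline{\lambda_{j-1}}\, \lambda_{k-1}} \int_{-1}^1 \psi_{k-1}(\tau) \int_{-1}^1 \psi_{j-1}(t)\, f_\varepsilon(t+\tau)\, dt\, d\tau. \tag{137}$$

Using the Cauchy-Schwarz inequality, this becomes

$$|e_{jk}| \le \frac{1}{|\lambda_{j-1}\, \lambda_{k-1}|}\, \|\psi_{k-1}\| \sqrt{ \int_{-1}^1 \left| \int_{-1}^1 \psi_{j-1}(t)\, f_\varepsilon(t+\tau)\, dt \right|^2 d\tau } \le \frac{1}{|\lambda_{j-1}\, \lambda_{k-1}|} \sqrt{ \int_{-1}^1 \|\psi_{j-1}\|^2 \int_{-1}^1 |f_\varepsilon(t+\tau)|^2\, dt\, d\tau } = \frac{1}{|\lambda_{j-1}\, \lambda_{k-1}|} \sqrt{ \int_{-1}^1 \int_{-1}^1 |f_\varepsilon(t+\tau)|^2\, dt\, d\tau } < \frac{2\varepsilon}{|\lambda_{j-1}\, \lambda_{k-1}|}. \tag{138}$$

□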


From inspection of Theorem 2.5, it can easily be seen that the number $N$ of eigenvalues needed for a bandwidth of $2c$ and an accuracy of $\varepsilon^2$ is roughly twice the number of eigenvalues needed for a bandwidth of $c$ and an accuracy of $\varepsilon$. Thus a generalized Gaussian quadrature for a bandwidth $2c$ and an accuracy $\varepsilon^2$ has roughly the same number of nodes as are needed for interpolation of accuracy $\varepsilon$. In our numerical experiments, this correspondence was found to be much closer than the rough bounds in Theorem 2.5 indicate: in the results tabulated in Section 8, the number of nodes for an interpolation formula of a desired accuracy $\varepsilon$ was always chosen to be the number of quadrature nodes for a desired accuracy $\varepsilon^2$ for twice the band limit (that number, in turn, being chosen as indicated in Section 5); the correspondence between the desired accuracy and the experimentally measured maximum error can be seen in Tables 3 and 4.

The coefficients $c_1, c_2, \ldots, c_n$ produced by this interpolation procedure (see (126)) can, of course, just as easily be used for evaluating derivatives or indefinite integrals of the interpolated function, as they can for computing the function itself.


8 Numerical Results


The algorithms of Sections 5-7 have been implemented in double precision (64-bit floating point) arithmetic, with results shown in Tables 1-4. Tables 1 and 2 show the performance of quadrature nodes produced by the schemes of Sections 5 and 6, when used as quadrature nodes; Tables 3 and 4 show their performance when used as interpolation nodes. These are not actually the same sets of nodes; even with the bandwidth $c$ for interpolation being half of the bandwidth for quadrature (as it is in the tables), more nodes are needed to achieve a given accuracy of interpolation than are needed to achieve a given accuracy of quadrature, as can be seen by comparing the number of nodes (printed in the column labeled $n$ in each table). The error figures in the tables are approximations of the maximum error of interpolation or of the quadrature, when applied to functions of the form $\cos(ax)$ and $\sin(ax)$, with $0 \le a \le c$; they were computed by measuring the error at a large number of points in $a$ (for interpolation, in both $a$ and $x$), including the extremes. The column labeled "Roots" contains the errors for the nodes produced by the scheme of Section 6; the column labeled "Refined" contains the errors after those nodes, used as a starting point, have been run through the scheme of Section 5. The variable $\varepsilon$ which appears in the tables is the requested accuracy, used to determine the number of nodes in the ways described in Sections 5 and 7.
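
The error measure described above can be sketched as follows. The code is an illustration under assumptions, not the FORTRAN implementation behind the tables: it evaluates the measure for classical Gauss-Legendre nodes (the polynomial rule the tables compare against), since prolate nodes are not constructed here, and the function names and the sampling density are hypothetical choices.

```python
import math

def gauss_legendre(n):
    """Nodes and weights of the n-point Gauss-Legendre rule via Newton's method."""
    nodes, weights = [], []
    for i in range(1, n + 1):
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))  # standard initial guess
        dp = 1.0
        for _ in range(100):
            p0, p1 = 1.0, x  # P_0 and P_1
            for k in range(2, n + 1):
                p0, p1 = p1, ((2 * k - 1) * x * p1 - (k - 1) * p0) / k
            dp = n * (x * p1 - p0) / (x * x - 1.0)  # derivative of P_n at x
            dx = p1 / dp
            x -= dx
            if abs(dx) < 1e-15:
                break
        nodes.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights

def max_quadrature_error(nodes, weights, c, samples=200):
    """Largest error over cos(a x) and sin(a x), 0 <= a <= c, on [-1, 1]."""
    worst = 0.0
    for i in range(samples + 1):
        a = c * i / samples  # includes both extremes a = 0 and a = c
        exact_cos = 2.0 if a == 0.0 else 2.0 * math.sin(a) / a
        approx_cos = sum(w * math.cos(a * x) for x, w in zip(nodes, weights))
        approx_sin = sum(w * math.sin(a * x) for x, w in zip(nodes, weights))
        # The exact integral of sin(a x) over [-1, 1] is zero.
        worst = max(worst, abs(approx_cos - exact_cos), abs(approx_sin))
    return worst

error_16 = max_quadrature_error(*gauss_legendre(16), c=10.0)
error_4 = max_quadrature_error(*gauss_legendre(4), c=10.0)
```

Here `error_16` is close to machine precision while `error_4` is of order one, which shows how sharply the measured maximum error depends on the number of nodes for a fixed band limit.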

Also tabulated are the numbers of Legendre nodes required to achieve the same accuracy $\varepsilon$ using polynomial interpolation or quadrature schemes. Since Chebyshev nodes are generally known to be superior for interpolation, for that case the numbers of Chebyshev nodes required to achieve the same accuracy are also tabulated.

Figure 2 contains the maximum norm of the derivative of each prolate function $\psi_j(x)$, for $c = 200$ and $x \in [-1,1]$, as a function of $j$; also graphed, for comparison, is the maximum norm of the derivative of each normalized Legendre polynomial $P_j(x)$ over the same range; and graphed below, on the same horizontal scale, are the norms of the eigenvalues $\lambda_j$. The graph shows that, for this value of $c$, computing the derivatives of a function given by a prolate series is a better-conditioned operation than computing the derivatives of a function given by a Legendre series of the same number of terms. (Obviously, if the number of terms can also be reduced, as in the situations of Tables 1-4, there is a further improvement in the condition number.) The same general pattern of behavior is exhibited for other values of $c$; as $c$ approaches zero (and the prolate functions approach the Legendre polynomials), the value of $j$ at which the maximum norm of the derivative rises sharply also approaches zero (as is to be expected, since for $c = 0$ the prolate functions reduce to Legendre polynomials). Finally, Tables 5 and 6 contain samples of quadrature weights and nodes.

Remark 8.1 In this paper, detailed discussion of issues encountered in the implementation of numerical algorithms has been deliberately avoided, as well as any discussion of CPU time requirements, memory requirements, etc. Thus, we limit ourselves to observing that all algorithms have been implemented in FORTRAN, that with the exception of the procedure for the evaluation of Prolate Spheroidal Wave functions described in Section 4, we have not designed or implemented any new or original numerical algorithms, and that the procedure of Section 4 consists of applying standard tools of numerical analysis (diagonalization of a tridiagonal matrix) to the well-known recursion (61). The resulting algorithm for the evaluation of prolate spheroidal wave functions has CPU time requirements proportional to $c^2$, with a fairly large proportionality constant. The procedure of [2], when applied to the system of functions $\psi_0, \psi_1, \ldots, \psi_{2n+1}$, requires order $n^3$ operations, also with a fairly large proportionality constant. On the other hand, the cost of finding all $n$ roots of the function $\psi_n$ lying on the interval $[-1,1]$ is proportional to $n$, and the proportionality constant is not large. The largest $c$ we have dealt with in our experiments was about 6000, with resulting quadratures having about 1900 nodes. In this regime, the construction of the quadrature (both nodes and weights) took several minutes on the 300-megaflop SUN workstation; while there are fairly obvious ways to reduce the cost of the calculation (both in terms of asymptotic CPU time requirements and in terms of associated proportionality constants) we have made no effort to do so.

The  following  observations  can  be  made  from  the  examples  presented  in  this  section, 
and  from  the  more  extensive  tests  performed  by  the  authors. 

1.  When  the  nodes  obtained  via  the  algorithm  of  [2]  are  used  for  the  integration  of  band- 
limited  functions,  the  resulting  quadrature  rules  are  significantly  more  accurate  than  the 
quadratures  obtained  from  the  nodes  of  appropriately  chosen  prolate  functions;  however, 
the  difference  between  the  numbers  of  nodes  required  by  the  two  approaches  to  obtain 
a  prescribed  precision  is  not  large.  When  the  nodes  obtained  via  the  two  approaches  are 
used  for  the  interpolation  (as  opposed  to  the  integration)  of  band-limited  functions,  the 
performances  of  the  two  are  virtually  identical. 

2. For large $c$, the number of nodes required by a quadrature rule for the integration of band-limited functions with the band-limit $c$ is close to $c/\pi$; the dependence on the required precision of integration is weak (as one would expect from Theorem 2.5 and subsequent developments).

3. The number of nodes required by our quadrature rules to integrate band-limited functions is roughly $\pi/2$ times less than the number of Gaussian nodes; the number of nodes required by our interpolation formulae in order to interpolate band-limited functions is roughly $\pi/2$ times less than the number of Chebyshev (or Gaussian) nodes. Again, the dependence of the required number of nodes on the accuracy requirements is weak.

4. The norm of the differentiation operator based on our nodes is of the order $c^{3/2}$, as compared to the norm of the spectral differentiation operators obtained from classical polynomial expansions; this might be useful in the design of spectral (or pseudospectral) techniques.
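
The node counts in observations 2 and 3 can be turned into a rough calculator. The sketch below is only leading-order arithmetic under assumptions: it uses the estimate $c/\pi$ for the prolate-based quadrature and the factor $\pi/2$ for the classical rule, ignores the weak dependence on the requested precision, and the function names are hypothetical.

```python
import math

def prolate_node_estimate(c):
    """Leading-order number of quadrature nodes for band limit c (observation 2)."""
    return c / math.pi

def gaussian_node_estimate(c):
    """Leading-order number of Gaussian nodes, pi/2 times more (observation 3)."""
    return (math.pi / 2.0) * prolate_node_estimate(c)

c = 6000.0
prolate_nodes = round(prolate_node_estimate(c))
gaussian_nodes = round(gaussian_node_estimate(c))
```

For $c = 6000$ the first estimate gives about 1910 nodes, in line with the roughly 1900 nodes quoted in Remark 8.1, while the second gives about 3000.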


9  Miscellaneous  Properties 

Prolate  spheroidal  wave  functions  possess  a  rich  set  of  properties,  vaguely  resembling  the 
properties  of  Bessel  functions.  This  section  establishes  some  of  those  properties.  Some 
of  the  identities  below  can  be  found  in  [20], [17], [5];  others  are  easily  derivable  from  the 
former. 


The identity

$$e^{icxt} = \sum_{j=0}^{\infty} \lambda_j\, \psi_j(x)\, \psi_j(t), \tag{139}$$

(see  Section  5)  has  a  number  of  consequences  which,  while  fairly  obvious,  seem  worth 
recording,  since  similar  properties  of  other  special  functions  have  often  been  found  useful. 
Differentiating  (139)  m  times  with  respect  to  x  and  n  times  with  respect  to  t  yields  the 
formula 

/  1  \  (m+n)  oc 

*’"‘"‘‘“=(4)  (140) 

xlc/  j= 0 

for all $x, t \in [-1,1]$. Multiplying (139) by $e^{-icut}$, and integrating with respect to $t$, converts it into

$$\frac{\sin(c \cdot (x-u))}{x-u} = \frac{c}{2} \sum_{j=0}^{\infty} |\lambda_j|^2\, \psi_j(x)\, \psi_j(u). \tag{141}$$

Taking the squared norm of (139), and integrating with respect to $x$ and $t$, yields the formula

$$\sum_{j=0}^{\infty} |\lambda_j|^2 = 4; \tag{142}$$

combining this with (21) yields

$$\sum_{j=0}^{\infty} \mu_j = \frac{2c}{\pi}. \tag{143}$$

Setting $x = t = 1$ converts (139) into

$$e^{ic} = \sum_{j=0}^{\infty} \lambda_j\, \psi_j^2(1). \tag{144}$$


The identity

$$\lambda_j\, \psi_j(x) = \int_{-1}^1 e^{icxt}\, \psi_j(t)\, dt \tag{145}$$

(see Section 2.5) also has a number of simple but potentially useful consequences. Differentiating it $k$ times with respect to $x$, we get

$$\lambda_j\, \psi_j^{(k)}(x) = (ic)^k \int_{-1}^1 e^{icxt}\, t^k\, \psi_j(t)\, dt. \tag{146}$$

We next consider the integral

$$f(x) = f(a,x) = \int_{-1}^1 \frac{e^{icxt}}{t-a}\, \psi_j(t)\, dt. \tag{147}$$

Differentiating (147) with respect to $x$, we have

$$\frac{d}{dx} f(a,x) = ic \int_{-1}^1 \frac{t\, e^{icxt}}{t-a}\, \psi_j(t)\, dt. \tag{148}$$

Multiplying (147) by $ica$, and subtracting it from (148), we obtain

$$\frac{d}{dx} f(a,x) - ica\, f(a,x) = ic \int_{-1}^1 e^{icxt}\, \psi_j(t)\, dt = ic\, \lambda_j\, \psi_j(x). \tag{149}$$

In other words, $f$ satisfies the differential equation

$$f'(x) - ica\, f(x) = ic\, \lambda_j\, \psi_j(x). \tag{150}$$

The standard "variation of parameter" calculation provides the solution to (150):

$$f(x) = ic\, \lambda_j \int_0^x e^{-ica(t-x)}\, \psi_j(t)\, dt + f(0)\, e^{icax}. \tag{151}$$
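
The variation-of-parameters formula can be checked numerically on a simplified right-hand side. The sketch below rests on assumptions made only for illustration: the forcing term $ic\lambda_j\psi_j(t)$ is replaced by a generic function `g` (here the constant 1, for which a closed form is available), `kappa` stands for the product $ca$, and the integral is evaluated with a composite Simpson rule.

```python
import cmath

def simpson(g, a, b, panels=200):
    """Composite Simpson rule on [a, b]; panels must be even."""
    h = (b - a) / panels
    total = g(a) + g(b)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def solve_first_order(g, kappa, f0, x):
    """Value at x of the solution of f' - i*kappa*f = g with f(0) = f0."""
    integral = simpson(lambda t: cmath.exp(1j * kappa * (x - t)) * g(t), 0.0, x)
    return integral + f0 * cmath.exp(1j * kappa * x)

kappa, f0, x = 10.0, 0.3 + 0.2j, 0.7

# Formula evaluated numerically with the test forcing g(t) = 1.
numeric = solve_first_order(lambda t: 1.0, kappa, f0, x)

# Closed-form solution of f' - i*kappa*f = 1, f(0) = f0, for comparison.
closed_form = ((cmath.exp(1j * kappa * x) - 1.0) / (1j * kappa)
               + f0 * cmath.exp(1j * kappa * x))
```

The two values agree to quadrature accuracy, confirming that the kernel $e^{i\kappa(x-t)}$ and the homogeneous term $f(0)\,e^{i\kappa x}$ are combined with the right signs.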


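Introducing the notation

$$D = \frac{1}{ic} \cdot \frac{d}{dx} \tag{152}$$

(i.e. $D$ is the product of multiplication by $1/ic$ and differentiation), we rewrite (146) as

$$D^k(\psi_j)(x) = \frac{1}{\lambda_j} \int_{-1}^1 t^k\, e^{icxt}\, \psi_j(t)\, dt; \tag{153}$$

for an arbitrary polynomial $P$ (with real or complex coefficients),

$$P(D)(\psi_j)(x) = \frac{1}{\lambda_j} \int_{-1}^1 P(t)\, e^{icxt}\, \psi_j(t)\, dt. \tag{154}$$

By the same token, the function $\phi$ defined by the formula

$$\phi(x) = \int_{-1}^1 \frac{e^{icxt}}{P(t)}\, \psi_m(t)\, dt \tag{155}$$

satisfies the differential equation

$$P(D)(\phi)(x) = \lambda_m\, \psi_m(x). \tag{156}$$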


The following lemma provides a recursion connecting the values of the $k$-th derivative of the function $\psi_m$ with its derivatives of orders $k-1$, $k-2$, $k-3$, $k-4$.

Lemma 9.1 For any positive real $c$, integer $m \ge 0$, and $x \in (-\infty, +\infty)$,

$$(1-x^2)\, \psi_m^{(k+2)}(x) - 2(k+1)\, x\, \psi_m^{(k+1)}(x) + (\chi_m - k(k+1) - c^2 x^2)\, \psi_m^{(k)}(x) - 2c^2 k\, x\, \psi_m^{(k-1)}(x) - c^2 k(k-1)\, \psi_m^{(k-2)}(x) = 0 \tag{157}$$

for all $k \ge 2$.

Furthermore,

$$(1-x^2)\, \psi_m'''(x) - 4x\, \psi_m''(x) + (\chi_m - 2 - c^2 x^2)\, \psi_m'(x) - 2c^2 x\, \psi_m(x) = 0. \tag{158}$$

In particular,

$$-2(k+1)\, \psi_m^{(k+1)}(1) + (\chi_m - k(k+1) - c^2)\, \psi_m^{(k)}(1) - 2c^2 k\, \psi_m^{(k-1)}(1) - c^2 k(k-1)\, \psi_m^{(k-2)}(1) = 0 \tag{159}$$

for all $k \ge 2$, and

$$-2\, \psi_m'(1) + (\chi_m - c^2)\, \psi_m(1) = 0, \tag{160}$$

$$-4\, \psi_m''(1) + (\chi_m - 2 - c^2)\, \psi_m'(1) - 2c^2\, \psi_m(1) = 0. \tag{161}$$

Furthermore, for all integer $m \ge 0$ and $k \ge 2$,

$$\psi_m^{(k+2)}(0) + (\chi_m - k(k+1))\, \psi_m^{(k)}(0) - c^2 k(k-1)\, \psi_m^{(k-2)}(0) = 0. \tag{162}$$

For all odd $m$,

$$\psi_m'''(0) + (\chi_m - 2)\, \psi_m'(0) = 0, \tag{163}$$

and for all even $m$,

$$\psi_m''(0) + \chi_m\, \psi_m(0) = 0. \tag{164}$$

Finally, for all integer $m \ge 0$, $k \ge 0$,

$$\psi_m(1) \neq 0, \tag{165}$$

$$\psi_{2m}^{(2k+1)}(0) = 0, \tag{166}$$

$$\psi_{2m+1}^{(2k)}(0) = 0. \tag{167}$$

Proof. All of the identities (157) - (164), (166), (167), are immediately obtained by repeated differentiation of (24).

In order to prove (165), we assume that

$$\psi_m(1) = 0 \tag{168}$$

for some integer $m \ge 0$, and observe that the combination of (168) with (159), (160), (161) implies that

$$\psi_m^{(k)}(1) = 0 \tag{169}$$

for all $k = 0, 1, 2, \ldots$. Due to the analyticity of $\psi_m(x)$ in the complex plane, this would imply that

$$\psi_m(x) = 0 \tag{170}$$

for all $x \in \mathbb{R}$, which is impossible since $\psi_m$ has unit norm.

□
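
The endpoint recursion (159)-(161) can be exercised in the limiting case $c = 0$, where, as noted in Section 8, the prolate functions reduce to Legendre polynomials. The sketch below makes several assumptions that are not taken from this section: it uses the unnormalized Legendre polynomial $P_m$ with $P_m(1) = 1$, the classical value $\chi_m = m(m+1)$ for $c = 0$, and hypothetical helper names. With $c = 0$ the recursion collapses to $\psi^{(k+1)}(1) = \frac{\chi_m - k(k+1)}{2(k+1)}\, \psi^{(k)}(1)$.

```python
from fractions import Fraction

def legendre_coeffs(m):
    """Monomial coefficients (lowest degree first) of P_m via Bonnet's recurrence."""
    p_prev, p = [Fraction(1)], [Fraction(0), Fraction(1)]
    if m == 0:
        return p_prev
    for k in range(1, m):
        # (k + 1) P_{k+1}(x) = (2k + 1) x P_k(x) - k P_{k-1}(x)
        nxt = [Fraction(2 * k + 1) * a for a in [Fraction(0)] + p]
        for i, a in enumerate(p_prev):
            nxt[i] -= k * a
        p_prev, p = p, [a / (k + 1) for a in nxt]
    return p

def derivative_at_one(coeffs, k):
    """k-th derivative at x = 1 of the polynomial with the given coefficients."""
    total = Fraction(0)
    for j, a in enumerate(coeffs):
        if j >= k:
            falling = 1
            for i in range(k):
                falling *= j - i
            total += a * falling
    return total

def endpoint_derivatives(m, kmax):
    """Derivatives at x = 1 for k = 0..kmax from the c = 0 form of (159)-(161)."""
    chi = m * (m + 1)  # assumed c = 0 value of chi_m
    values = [Fraction(1)]  # normalization P_m(1) = 1
    for k in range(kmax):
        values.append(Fraction(chi - k * (k + 1), 2 * (k + 1)) * values[k])
    return values

m = 5
from_recursion = endpoint_derivatives(m, 6)
direct = [derivative_at_one(legendre_coeffs(m), k) for k in range(7)]
```

Exact rational arithmetic is used so that the two lists can be compared entry by entry; for $c > 0$ the same recursion needs the additional terms in $c^2$ and a value of $\psi_m(1)$ obtained by other means.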

The following is an immediate consequence of the identity (160) of Lemma 9.1.

Corollary 9.2 For all integer $m, n \ge 0$,

$$\psi_m(1) \cdot \psi_n'(1) - \psi_m'(1) \cdot \psi_n(1) = \tfrac{1}{2}\, (\chi_n - \chi_m) \cdot \psi_n(1) \cdot \psi_m(1), \tag{171}$$

where $\chi_m, \chi_n \in \mathbb{R}$ are as defined in Theorem 2.6.

Theorem 3.1, in Section 3.1, gives formulae for the entries of matrices for differentiation of prolate series and for multiplication of prolate series by $x$. Matrices for any combination of differentiation and of multiplication by a polynomial can obviously be constructed from these two matrices; for instance, calling the differentiation matrix $D$, and the multiplication-by-$x$ matrix $X$, the matrix for taking the second derivative of a prolate series, then multiplying it by $5 - x^2$, is equal to $(5I - X^2)D^2$.
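
The way such operator matrices compose can be sketched in a simpler setting. The code below is an illustration under assumptions: it works with coefficient vectors in the monomial basis $1, x, \ldots, x^{N-1}$ instead of a prolate basis (the entries given by Theorem 3.1 are not reproduced here), the multiplication matrix is truncated at degree $N - 1$, and all names are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def apply(a, v):
    """Matrix-vector product."""
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]

N = 6  # polynomials of degree < N, coefficients ordered from x^0 to x^(N-1)

identity = [[1 if i == j else 0 for j in range(N)] for i in range(N)]
# Differentiation: d/dx x^j = j x^(j-1), so column j has entry j in row j-1.
D = [[j if i == j - 1 else 0 for j in range(N)] for i in range(N)]
# Multiplication by x: x * x^j = x^(j+1), truncated when j + 1 = N.
X = [[1 if i == j + 1 else 0 for j in range(N)] for i in range(N)]

X2 = matmul(X, X)
D2 = matmul(D, D)
five_minus_x2 = [[5 * identity[i][j] - X2[i][j] for j in range(N)] for i in range(N)]
M = matmul(five_minus_x2, D2)  # matrix of f -> (5 - x^2) f''

coeffs = [0, 0, 0, 0, 1, 0]  # the polynomial x^4
result = apply(M, coeffs)    # coefficients of (5 - x^2) * 12 x^2 = 60 x^2 - 12 x^4
```

The truncation of `X` means the product is only reliable when the result still fits below degree $N$, as it does here; an analogous caveat applies to any finite section of the prolate matrices.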

In many cases, however, there are simpler formulae for the entries of such matrices, that is, for inner products of $\psi_j(x)$ with its derivatives and with polynomials. The following theorems establish several such formulae, as well as a few formulae for inner products which do not involve $\psi_j(x)$ itself but only its derivatives. We start with Theorem 3.1, restated here for consistency.


Theorem 9.3 Suppose that $c$ is real and positive, and that the integers $m$ and $n$ are non-negative. If $m \equiv n \pmod 2$, then

$$\int_{-1}^1 \psi_n'(x)\, \psi_m(x)\, dx = \int_{-1}^1 x\, \psi_n(x)\, \psi_m(x)\, dx = 0. \tag{172}$$

If $m \not\equiv n \pmod 2$, then

$$\int_{-1}^1 \psi_n'(x)\, \psi_m(x)\, dx = \frac{2 \lambda_m^2}{\lambda_m^2 + \lambda_n^2}\, \psi_m(1)\, \psi_n(1), \tag{173}$$

$$\int_{-1}^1 x\, \psi_n(x)\, \psi_m(x)\, dx = \frac{2}{ic}\, \frac{\lambda_m \lambda_n}{\lambda_m^2 + \lambda_n^2}\, \psi_m(1)\, \psi_n(1). \tag{174}$$


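Theorem 9.4 Suppose that $c$ is real and positive, and that the integers $m$ and $n$ are non-negative. If $m \not\equiv n \pmod 2$, then

$$\int_{-1}^1 x\, \psi_n'(x)\, \psi_m(x)\, dx = 0. \tag{175}$$

If $m \equiv n \pmod 2$, then

$$\int_{-1}^1 x\, \psi_n'(x)\, \psi_m(x)\, dx = \frac{\lambda_m}{\lambda_m + \lambda_n}\, \left( 2\, \psi_m(1)\, \psi_n(1) - \delta_{mn} \right). \tag{176}$$

Proof. Identity (175) is obvious since the functions $\psi_j$ are alternately even and odd (see Theorem 2.4). In order to prove (176), we consider the integral

$$\int_{-1}^1 x\, \psi_n'(x)\, \psi_m(x)\, dx = \frac{1}{\lambda_n} \int_{-1}^1 x \left( \frac{d}{dx} \int_{-1}^1 e^{icxt}\, \psi_n(t)\, dt \right) \psi_m(x)\, dx = \frac{ic}{\lambda_n} \int_{-1}^1 x\, \psi_m(x) \left( \int_{-1}^1 t\, \psi_n(t)\, e^{icxt}\, dt \right) dx = \frac{ic}{\lambda_n} \int_{-1}^1 t \left( \int_{-1}^1 x\, \psi_m(x)\, e^{icxt}\, dx \right) \psi_n(t)\, dt = \frac{\lambda_m}{\lambda_n} \int_{-1}^1 t\, \psi_m'(t)\, \psi_n(t)\, dt.$$

In other words,

$$\int_{-1}^1 x\, \psi_n'(x)\, \psi_m(x)\, dx = \frac{\lambda_m}{\lambda_n} \int_{-1}^1 x\, \psi_m'(x)\, \psi_n(x)\, dx. \tag{177}$$

On the other hand, integrating the left side of (177) by parts, we obtain

$$\int_{-1}^1 x\, \psi_n'(x)\, \psi_m(x)\, dx = 2\, \psi_m(1)\, \psi_n(1) - \int_{-1}^1 \left( \psi_n(x)\, \psi_m'(x)\, x + \psi_n(x)\, \psi_m(x) \right) dx = 2\, \psi_m(1)\, \psi_n(1) - \int_{-1}^1 x\, \psi_n(x)\, \psi_m'(x)\, dx - \delta_{mn}.$$

Combining (177) with the preceding identity, we have

$$\frac{\lambda_m}{\lambda_n} \int_{-1}^1 x\, \psi_m'(x)\, \psi_n(x)\, dx = 2\, \psi_m(1)\, \psi_n(1) - \int_{-1}^1 x\, \psi_m'(x)\, \psi_n(x)\, dx - \delta_{mn},$$

from which (176) follows directly. □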


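Theorem 9.5 Suppose that $c$ is real and positive, and that the integers $m$ and $n$ are non-negative. If $m \not\equiv n \pmod 2$, then

$$\int_{-1}^1 x^2\, \psi_n''(x)\, \psi_m(x)\, dx = 0. \tag{178}$$

If $m \equiv n \pmod 2$ and $m \neq n$, then

$$\int_{-1}^1 x^2\, \psi_m''(x)\, \psi_n(x)\, dx = \frac{2 \lambda_n}{\lambda_m - \lambda_n}\, \left( \psi_n'(1)\, \psi_m(1) - \psi_m'(1)\, \psi_n(1) \right) - \frac{4 \lambda_n}{\lambda_n + \lambda_m}\, \psi_n(1)\, \psi_m(1) \tag{179}$$

$$= \frac{\lambda_n}{\lambda_m - \lambda_n}\, (\chi_n - \chi_m)\, \psi_n(1)\, \psi_m(1) - \frac{4 \lambda_n}{\lambda_n + \lambda_m}\, \psi_n(1)\, \psi_m(1), \tag{180}$$

where $\chi_m, \chi_n \in \mathbb{R}$ are as defined in Theorem 2.6.

Proof. Clearly (178) is true, since the functions $\psi_j$ are alternately even and odd. In order to prove (179) and (180), supposing that $m \equiv n \pmod 2$ and $m \neq n$, we consider the integral

$$\int_{-1}^1 x^2\, \psi_n''(x)\, \psi_m(x)\, dx = \frac{1}{\lambda_n} \int_{-1}^1 x^2 \left( \frac{d^2}{dx^2} \int_{-1}^1 e^{icxt}\, \psi_n(t)\, dt \right) \psi_m(x)\, dx = -\frac{c^2}{\lambda_n} \int_{-1}^1 \psi_m(x)\, x^2 \left( \int_{-1}^1 t^2\, \psi_n(t)\, e^{icxt}\, dt \right) dx = -\frac{c^2}{\lambda_n} \int_{-1}^1 \left( \int_{-1}^1 \psi_m(x)\, x^2\, e^{icxt}\, dx \right) \psi_n(t)\, t^2\, dt = \frac{\lambda_m}{\lambda_n} \int_{-1}^1 t^2\, \psi_n(t)\, \psi_m''(t)\, dt,$$

which is summarized as

$$\int_{-1}^1 x^2\, \psi_n''(x)\, \psi_m(x)\, dx = \frac{\lambda_m}{\lambda_n} \int_{-1}^1 x^2\, \psi_m''(x)\, \psi_n(x)\, dx. \tag{181}$$

On the other hand, integrating the left side of (181) by parts, we have

$$\int_{-1}^1 x^2\, \psi_n''(x)\, \psi_m(x)\, dx = 2\, \psi_n'(1)\, \psi_m(1) - \int_{-1}^1 \psi_n'(x) \left( \psi_m'(x)\, x^2 + 2x\, \psi_m(x) \right) dx = 2\, \psi_n'(1)\, \psi_m(1) - 2 \int_{-1}^1 \psi_n'(x)\, \psi_m(x)\, x\, dx - \int_{-1}^1 x^2\, \psi_n'(x)\, \psi_m'(x)\, dx. \tag{182}$$

Due to Theorem 9.4 and the fact that $m \neq n$, we immediately rewrite (182) as

$$\int_{-1}^1 x^2\, \psi_n''(x)\, \psi_m(x)\, dx = 2\, \psi_n'(1)\, \psi_m(1) - 2 \cdot \frac{2 \lambda_m}{\lambda_m + \lambda_n}\, \psi_m(1)\, \psi_n(1) - \int_{-1}^1 x^2\, \psi_n'(x)\, \psi_m'(x)\, dx, \tag{183}$$

which we rewrite as

$$\int_{-1}^1 x^2\, \psi_n''(x)\, \psi_m(x)\, dx = 2\, \psi_n'(1)\, \psi_m(1) - \frac{4 \lambda_m}{\lambda_m + \lambda_n}\, \psi_n(1)\, \psi_m(1) - \int_{-1}^1 x^2\, \psi_n'(x)\, \psi_m'(x)\, dx. \tag{184}$$

Swapping $m$ with $n$, we convert (184) into

$$\int_{-1}^1 x^2\, \psi_m''(x)\, \psi_n(x)\, dx = 2\, \psi_m'(1)\, \psi_n(1) - \frac{4 \lambda_n}{\lambda_m + \lambda_n}\, \psi_n(1)\, \psi_m(1) - \int_{-1}^1 x^2\, \psi_m'(x)\, \psi_n'(x)\, dx. \tag{185}$$

Combining (184) and (185), we obtain

$$\int_{-1}^1 x^2\, \psi_n''(x)\, \psi_m(x)\, dx - 2\, \psi_n'(1)\, \psi_m(1) + \frac{4 \lambda_m}{\lambda_m + \lambda_n}\, \psi_n(1)\, \psi_m(1) = \int_{-1}^1 x^2\, \psi_m''(x)\, \psi_n(x)\, dx - 2\, \psi_m'(1)\, \psi_n(1) + \frac{4 \lambda_n}{\lambda_m + \lambda_n}\, \psi_n(1)\, \psi_m(1), \tag{186}$$

which is obviously equivalent to

$$\int_{-1}^1 x^2\, \psi_n''(x)\, \psi_m(x)\, dx = \int_{-1}^1 x^2\, \psi_m''(x)\, \psi_n(x)\, dx + 2 \left( \psi_n'(1)\, \psi_m(1) - \psi_m'(1)\, \psi_n(1) \right) + 4\, \frac{\lambda_n - \lambda_m}{\lambda_n + \lambda_m}\, \psi_n(1)\, \psi_m(1). \tag{187}$$

Finally, combining (181) with (187), we have

$$\frac{\lambda_m}{\lambda_n} \int_{-1}^1 x^2\, \psi_m''(x)\, \psi_n(x)\, dx = \int_{-1}^1 x^2\, \psi_m''(x)\, \psi_n(x)\, dx + 2 \left( \psi_n'(1)\, \psi_m(1) - \psi_m'(1)\, \psi_n(1) \right) + 4\, \frac{\lambda_n - \lambda_m}{\lambda_n + \lambda_m}\, \psi_n(1)\, \psi_m(1),$$

which is easily rewritten as

$$\left( \frac{\lambda_m}{\lambda_n} - 1 \right) \int_{-1}^1 x^2\, \psi_m''(x)\, \psi_n(x)\, dx = 2 \left( \psi_n'(1)\, \psi_m(1) - \psi_m'(1)\, \psi_n(1) \right) + 4\, \frac{\lambda_n - \lambda_m}{\lambda_n + \lambda_m}\, \psi_n(1)\, \psi_m(1),$$

or

$$\int_{-1}^1 x^2\, \psi_m''(x)\, \psi_n(x)\, dx = \frac{2 \lambda_n}{\lambda_m - \lambda_n}\, \left( \psi_n'(1)\, \psi_m(1) - \psi_m'(1)\, \psi_n(1) \right) - \frac{4 \lambda_n}{\lambda_n + \lambda_m}\, \psi_n(1)\, \psi_m(1). \tag{188}$$

We finally rewrite (188) as (180) using Corollary 9.2. □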

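The following theorem is an immediate consequence of combining the preceding theorem with equation (184) from its proof.

Theorem 9.6 Suppose that $c$ is real and positive, and that the integers $m$ and $n$ are non-negative. If $m \not\equiv n \pmod 2$, then

$$\int_{-1}^1 x^2\, \psi_n'(x)\, \psi_m'(x)\, dx = 0. \tag{189}$$

If $m \equiv n \pmod 2$ and $m \neq n$,

$$\int_{-1}^1 x^2\, \psi_n'(x)\, \psi_m'(x)\, dx = 2\, \psi_m'(1)\, \psi_n(1) + \frac{2 \lambda_n}{\lambda_n - \lambda_m}\, \left( \psi_m(1)\, \psi_n'(1) - \psi_m'(1)\, \psi_n(1) \right) \tag{190}$$

$$= 2\, \psi_n'(1)\, \psi_m(1) + \frac{2 \lambda_m}{\lambda_n - \lambda_m}\, \left( \psi_n'(1)\, \psi_m(1) - \psi_m'(1)\, \psi_n(1) \right) \tag{191}$$

$$= \psi_m(1)\, \psi_n(1) \left( \frac{\lambda_m \chi_m - \lambda_n \chi_n}{\lambda_m - \lambda_n} - c^2 \right). \tag{192}$$
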

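Theorem 9.7 Suppose that c is real and positive, and that the integers m and n are
non-negative. If $m \not\equiv n \pmod 2$, then

$$
\int_{-1}^{1} \psi_n''(x)\,\psi_m(x)\,dx = \int_{-1}^{1} x^2\,\psi_n(x)\,\psi_m(x)\,dx = 0. \tag{193}
$$

If $m \equiv n \pmod 2$ and $m \neq n$, then

\begin{align*}
\int_{-1}^{1} \psi_n''(x)\,\psi_m(x)\,dx
 &= \frac{2\lambda_m^2}{\lambda_m^2-\lambda_n^2}\,\bigl(\psi_n'(1)\,\psi_m(1) - \psi_m'(1)\,\psi_n(1)\bigr) \tag{194} \\
 &= \frac{\lambda_m^2}{\lambda_m^2-\lambda_n^2}\,(\chi_n-\chi_m)\,\psi_m(1)\,\psi_n(1), \tag{195}
\end{align*}

\begin{align*}
\int_{-1}^{1} x^2\,\psi_n(x)\,\psi_m(x)\,dx
 &= \frac{2}{c^2}\,\frac{\lambda_m\lambda_n}{\lambda_m^2-\lambda_n^2}\,\bigl(\psi_m'(1)\,\psi_n(1) - \psi_n'(1)\,\psi_m(1)\bigr) \tag{196} \\
 &= \frac{1}{c^2}\,\frac{\lambda_m\lambda_n}{\lambda_m^2-\lambda_n^2}\,(\chi_m-\chi_n)\,\psi_m(1)\,\psi_n(1), \tag{197}
\end{align*}

where $\chi_m, \chi_n \in \mathbb{R}$ are as defined in Theorem 2.6.

Proof. Identity (193) is obvious, since the functions $\psi_j$ are alternately even and odd.
In order to prove (194)-(197), we start with the expression

$$
\lambda_n\,\psi_n''(x) = -c^2 \int_{-1}^{1} t^2\,e^{icxt}\,\psi_n(t)\,dt. \tag{198}
$$

Taking the inner product of (198) with $\psi_m$, we have

\begin{align*}
\lambda_n \int_{-1}^{1} \psi_n''(x)\,\psi_m(x)\,dx
 &= -c^2 \int_{-1}^{1} \Bigl(\int_{-1}^{1} t^2\,\psi_n(t)\,e^{icxt}\,dt\Bigr)\,\psi_m(x)\,dx \\
 &= -c^2 \int_{-1}^{1} t^2\,\psi_n(t)\,\Bigl(\int_{-1}^{1} \psi_m(x)\,e^{icxt}\,dx\Bigr)\,dt \\
 &= -c^2\,\lambda_m \int_{-1}^{1} t^2\,\psi_n(t)\,\psi_m(t)\,dt,
\end{align*}

which we summarize as

$$
\int_{-1}^{1} x^2\,\psi_n(x)\,\psi_m(x)\,dx = -\frac{\lambda_n}{c^2\,\lambda_m} \int_{-1}^{1} \psi_n''(x)\,\psi_m(x)\,dx. \tag{199}
$$

Swapping n, m, we rewrite (199) in the form of

$$
\int_{-1}^{1} x^2\,\psi_n(x)\,\psi_m(x)\,dx = -\frac{\lambda_m}{c^2\,\lambda_n} \int_{-1}^{1} \psi_m''(x)\,\psi_n(x)\,dx. \tag{200}
$$

Combining (199) and (200), we get

$$
\int_{-1}^{1} \psi_n''(x)\,\psi_m(x)\,dx = \frac{\lambda_m^2}{\lambda_n^2} \int_{-1}^{1} \psi_n(x)\,\psi_m''(x)\,dx. \tag{201}
$$

On the other hand, integrating the left side of (201) by parts, we have

\begin{align*}
\int_{-1}^{1} \psi_n''(x)\,\psi_m(x)\,dx
 &= \psi_n'(1)\,\psi_m(1) - \psi_n'(-1)\,\psi_m(-1) - \int_{-1}^{1} \psi_n'(x)\,\psi_m'(x)\,dx \\
 &= 2\,\psi_n'(1)\,\psi_m(1) - \bigl(\psi_n(1)\,\psi_m'(1) - \psi_n(-1)\,\psi_m'(-1)\bigr)
   + \int_{-1}^{1} \psi_n(x)\,\psi_m''(x)\,dx. \tag{202}
\end{align*}

Since $m \equiv n \pmod 2$, the product $\psi_n\,\psi_m'$ is odd, and we rewrite (202) in the form of

$$
\int_{-1}^{1} \psi_n''(x)\,\psi_m(x)\,dx
 = 2\,\bigl(\psi_n'(1)\,\psi_m(1) - \psi_n(1)\,\psi_m'(1)\bigr) + \int_{-1}^{1} \psi_n(x)\,\psi_m''(x)\,dx.
$$

Combining this identity with (201), we get

$$
\Bigl(\frac{\lambda_m^2}{\lambda_n^2} - 1\Bigr) \int_{-1}^{1} \psi_n(x)\,\psi_m''(x)\,dx
 = 2\,\bigl(\psi_n'(1)\,\psi_m(1) - \psi_m'(1)\,\psi_n(1)\bigr). \tag{203}
$$

Since $m \neq n$, we easily rewrite (203) as (194). We obtain expression (196) by combining
(200) and (194). The identities (195), (197) follow from (194), (196) immediately
due to Corollary 9.2. □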
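As a rough numerical illustration of Theorem 9.7 (not part of the original text), the sketch below evaluates both sides of (195) and (197) for a same-parity pair. It assumes NumPy; the helpers `pswf` and `residuals_195_197` are hypothetical names, and the prolate functions are approximated by a truncated Legendre expansion of the differential equation of Theorem 2.6, which is an assumption of this sketch.

```python
import numpy as np
from numpy.polynomial import legendre as leg


def pswf(c, count, terms=40, quad=80):
    """Approximate chi_j, lambda_j and Legendre coefficients of psi_j for j < count."""
    k = np.arange(terms, dtype=float)
    diag = k * (k + 1) + (2 * k * (k + 1) - 1) / ((2 * k + 3) * (2 * k - 1)) * c**2
    j = k[:-2]
    off = (j + 2) * (j + 1) / ((2 * j + 3) * np.sqrt((2 * j + 1) * (2 * j + 5))) * c**2
    chi, beta = np.linalg.eigh(np.diag(diag) + np.diag(off, 2) + np.diag(off, -2))
    coef = beta[:, :count] * np.sqrt(k + 0.5)[:, None]
    x, w = leg.leggauss(quad)
    # lambda_j from lambda_j psi_j(1) = int_{-1}^{1} exp(i c t) psi_j(t) dt.
    lam = (leg.legval(x, coef) * np.exp(1j * c * x)) @ w / leg.legval(1.0, coef)
    return chi[:count], lam, coef


def residuals_195_197(c, m, n):
    """Return |LHS - RHS| for (195) and for (197), with m != n of equal parity."""
    chi, lam, coef = pswf(c, max(m, n) + 1)
    x, w = leg.leggauss(80)
    psi = leg.legval(x, coef)
    d2psi = leg.legval(x, leg.legder(coef, 2))   # second derivatives psi_j''
    p1 = leg.legval(1.0, coef)
    inv = 1.0 / (lam[m]**2 - lam[n]**2)
    lhs_195 = np.sum(w * d2psi[n] * psi[m])
    rhs_195 = lam[m]**2 * inv * (chi[n] - chi[m]) * p1[m] * p1[n]
    lhs_197 = np.sum(w * x**2 * psi[n] * psi[m])
    rhs_197 = lam[m] * lam[n] * inv * (chi[m] - chi[n]) * p1[m] * p1[n] / c**2
    return abs(lhs_195 - rhs_195), abs(lhs_197 - rhs_197)
```

For example, `residuals_195_197(3.0, 0, 2)` should return two numbers near round-off; the choice c = 3 is arbitrary.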


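Theorem 9.8 Suppose that c is real and positive, and that the integers m and n are
non-negative. Let

$$
\Psi_n(y) = \int_0^y \psi_n(x)\,dx. \tag{204}
$$

If n is odd and m is even, then

\begin{align*}
\int_{-1}^{1} \frac{1}{t}\,\psi_n(t)\,\psi_m(t)\,dt
 = {} & 2ic\,\frac{\lambda_m\lambda_n}{\lambda_m^2+\lambda_n^2}\,\Psi_n(1)\,\Psi_m(1) \\
 & + \frac{\lambda_m}{\lambda_m^2+\lambda_n^2} \int_{-1}^{1} \frac{1}{t}\,\psi_n(t)\,dt \cdot \int_{-1}^{1} \psi_m(t)\,dt. \tag{205--207}
\end{align*}

If $m \equiv n \pmod 2$, then

$$
\int_{-1}^{1} \frac{1}{t}\,\psi_n(t)\,\psi_m(t)\,dt = 0. \tag{208}
$$

Proof. We start with the identity

$$
\lambda_n\,\psi_n(x) = \int_{-1}^{1} e^{icxt}\,\psi_n(t)\,dt. \tag{209}
$$

Integrating (209) with respect to x, we have

\begin{align*}
\lambda_n \int_0^y \psi_n(x)\,dx
 &= \int_0^y \Bigl(\int_{-1}^{1} e^{icxt}\,\psi_n(t)\,dt\Bigr)\,dx \tag{210} \\
 &= \int_{-1}^{1} \psi_n(t) \int_0^y e^{icxt}\,dx\,dt \tag{211} \\
 &= \int_{-1}^{1} \psi_n(t)\,\frac{e^{icyt}-1}{ict}\,dt, \tag{212}
\end{align*}

which we summarize as

$$
\lambda_n\,\Psi_n(y) = \frac{1}{ic} \int_{-1}^{1} \frac{1}{t}\,\psi_n(t)\,e^{icyt}\,dt - \frac{1}{ic} \int_{-1}^{1} \frac{1}{t}\,\psi_n(t)\,dt. \tag{213}
$$

Taking the inner product of (213) and $\psi_m$, we obtain

\begin{align*}
\lambda_n \int_{-1}^{1} \Psi_n(y)\,\psi_m(y)\,dy
 &= \frac{1}{ic} \int_{-1}^{1} \psi_m(y)\,\Bigl(\int_{-1}^{1} \frac{1}{t}\,\psi_n(t)\,e^{icyt}\,dt\Bigr)\,dy
  - \frac{1}{ic} \int_{-1}^{1} \psi_m(y)\,\Bigl(\int_{-1}^{1} \frac{1}{t}\,\psi_n(t)\,dt\Bigr)\,dy \tag{214} \\
 &= \frac{1}{ic} \int_{-1}^{1} \frac{1}{t}\,\psi_n(t)\,\Bigl(\int_{-1}^{1} \psi_m(y)\,e^{icyt}\,dy\Bigr)\,dt
  - \frac{1}{ic} \int_{-1}^{1} \frac{1}{t}\,\psi_n(t)\,dt \cdot \int_{-1}^{1} \psi_m(y)\,dy \tag{215} \\
 &= \frac{\lambda_m}{ic} \int_{-1}^{1} \frac{1}{t}\,\psi_n(t)\,\psi_m(t)\,dt
  - \frac{1}{ic} \int_{-1}^{1} \frac{1}{t}\,\psi_n(t)\,dt \cdot \int_{-1}^{1} \psi_m(y)\,dy, \tag{216}
\end{align*}

which we summarize as

$$
\int_{-1}^{1} \frac{1}{t}\,\psi_n(t)\,\psi_m(t)\,dt
 = ic\,\frac{\lambda_n}{\lambda_m} \int_{-1}^{1} \Psi_n(t)\,\psi_m(t)\,dt
 + \frac{1}{\lambda_m} \int_{-1}^{1} \frac{1}{t}\,\psi_n(t)\,dt \cdot \int_{-1}^{1} \psi_m(y)\,dy. \tag{217}
$$

Exchanging m with n, we convert (217) into

$$
\int_{-1}^{1} \frac{1}{t}\,\psi_m(t)\,\psi_n(t)\,dt
 = ic\,\frac{\lambda_m}{\lambda_n} \int_{-1}^{1} \Psi_m(t)\,\psi_n(t)\,dt
 + \frac{1}{\lambda_n} \int_{-1}^{1} \frac{1}{t}\,\psi_m(t)\,dt \cdot \int_{-1}^{1} \psi_n(y)\,dy, \tag{218}
$$

and combining (217), (218), we get

\begin{align*}
&\frac{\lambda_n}{\lambda_m}\,ic \int_{-1}^{1} \Psi_n(t)\,\psi_m(t)\,dt - \frac{\lambda_m}{\lambda_n}\,ic \int_{-1}^{1} \Psi_m(t)\,\psi_n(t)\,dt \\
&\quad = \frac{1}{\lambda_n} \int_{-1}^{1} \frac{1}{t}\,\psi_m(t)\,dt \cdot \int_{-1}^{1} \psi_n(t)\,dt
 - \frac{1}{\lambda_m} \int_{-1}^{1} \frac{1}{t}\,\psi_n(t)\,dt \cdot \int_{-1}^{1} \psi_m(t)\,dt. \tag{219}
\end{align*}

Suppose that m is even and n is odd; then the first product in the right hand side of (219)
is zero, so

$$
\frac{\lambda_n}{\lambda_m}\,ic \int_{-1}^{1} \Psi_n(t)\,\psi_m(t)\,dt - \frac{\lambda_m}{\lambda_n}\,ic \int_{-1}^{1} \Psi_m(t)\,\psi_n(t)\,dt
 = -\frac{1}{\lambda_m} \int_{-1}^{1} \frac{1}{t}\,\psi_n(t)\,dt \cdot \int_{-1}^{1} \psi_m(t)\,dt, \tag{220}
$$

which is equivalent to

$$
\int_{-1}^{1} \Psi_n(t)\,\psi_m(t)\,dt
 = \frac{\lambda_m^2}{\lambda_n^2} \int_{-1}^{1} \Psi_m(t)\,\psi_n(t)\,dt
 - \frac{1}{ic\,\lambda_n} \int_{-1}^{1} \frac{1}{t}\,\psi_n(t)\,dt \cdot \int_{-1}^{1} \psi_m(t)\,dt, \tag{221}
$$

or

$$
\int_{-1}^{1} \Psi_m(t)\,\psi_n(t)\,dt
 = \frac{\lambda_n^2}{\lambda_m^2} \int_{-1}^{1} \Psi_n(t)\,\psi_m(t)\,dt
 + \frac{\lambda_n}{ic\,\lambda_m^2} \int_{-1}^{1} \frac{1}{t}\,\psi_n(t)\,dt \cdot \int_{-1}^{1} \psi_m(t)\,dt. \tag{222}
$$

On the other hand, integrating the left side of (222) by parts, we obtain

$$
\int_{-1}^{1} \Psi_m(t)\,\psi_n(t)\,dt
 = \Psi_n(1)\,\Psi_m(1) - \Psi_n(-1)\,\Psi_m(-1) - \int_{-1}^{1} \Psi_n(t)\,\psi_m(t)\,dt. \tag{223}
$$

Since the product $\Psi_m(x)\,\Psi_n(x)$ is an odd function when $m \not\equiv n \pmod 2$, we rewrite (223)
as

$$
\int_{-1}^{1} \Psi_m(t)\,\psi_n(t)\,dt = 2\,\Psi_n(1)\,\Psi_m(1) - \int_{-1}^{1} \Psi_n(t)\,\psi_m(t)\,dt. \tag{224}
$$

The combination of (222) and (224) implies that

$$
\int_{-1}^{1} \Psi_n(t)\,\psi_m(t)\,dt + \frac{\lambda_n^2}{\lambda_m^2} \int_{-1}^{1} \Psi_n(t)\,\psi_m(t)\,dt
 = 2\,\Psi_n(1)\,\Psi_m(1) - \frac{\lambda_n}{ic\,\lambda_m^2} \int_{-1}^{1} \frac{1}{t}\,\psi_n(t)\,dt \cdot \int_{-1}^{1} \psi_m(t)\,dt, \tag{225}
$$

or

$$
\frac{\lambda_m^2+\lambda_n^2}{\lambda_m^2} \int_{-1}^{1} \Psi_n(t)\,\psi_m(t)\,dt
 = 2\,\Psi_n(1)\,\Psi_m(1) - \frac{\lambda_n}{ic\,\lambda_m^2} \int_{-1}^{1} \frac{1}{t}\,\psi_n(t)\,dt \cdot \int_{-1}^{1} \psi_m(t)\,dt, \tag{226}
$$

which is equivalent to

$$
\int_{-1}^{1} \Psi_n(t)\,\psi_m(t)\,dt
 = \frac{2\lambda_m^2}{\lambda_m^2+\lambda_n^2}\,\Psi_n(1)\,\Psi_m(1)
 - \frac{\lambda_n}{ic\,(\lambda_m^2+\lambda_n^2)} \int_{-1}^{1} \frac{1}{t}\,\psi_n(t)\,dt \cdot \int_{-1}^{1} \psi_m(t)\,dt. \tag{227}
$$

Finally, combining (217) and (227), we have

$$
\int_{-1}^{1} \frac{1}{t}\,\psi_n(t)\,\psi_m(t)\,dt
 = 2ic\,\frac{\lambda_m\lambda_n}{\lambda_m^2+\lambda_n^2}\,\Psi_n(1)\,\Psi_m(1)
 + \frac{\lambda_m}{\lambda_m^2+\lambda_n^2} \int_{-1}^{1} \frac{1}{t}\,\psi_n(t)\,dt \cdot \int_{-1}^{1} \psi_m(t)\,dt. \tag{228}
$$

Equation (208) is easily proven since the product $\frac{1}{x}\,\psi_m(x)\,\psi_n(x)$ is an odd function
whenever $m \equiv n \pmod 2$. □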
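Identity (228) lends itself to a quick numerical check. The following is only a sketch under stated assumptions, not part of the original text: NumPy is assumed, the helper names `pswf` and `residual_228` are hypothetical, and the prolate functions come from a truncated Legendre expansion. An even-order Gauss-Legendre rule is used so that no node falls on t = 0, where the factor 1/t would be evaluated.

```python
import numpy as np
from numpy.polynomial import legendre as leg


def pswf(c, count, terms=40, quad=80):
    """Approximate chi_j, lambda_j and Legendre coefficients of psi_j for j < count."""
    k = np.arange(terms, dtype=float)
    diag = k * (k + 1) + (2 * k * (k + 1) - 1) / ((2 * k + 3) * (2 * k - 1)) * c**2
    j = k[:-2]
    off = (j + 2) * (j + 1) / ((2 * j + 3) * np.sqrt((2 * j + 1) * (2 * j + 5))) * c**2
    chi, beta = np.linalg.eigh(np.diag(diag) + np.diag(off, 2) + np.diag(off, -2))
    coef = beta[:, :count] * np.sqrt(k + 0.5)[:, None]
    x, w = leg.leggauss(quad)
    lam = (leg.legval(x, coef) * np.exp(1j * c * x)) @ w / leg.legval(1.0, coef)
    return chi[:count], lam, coef


def residual_228(c, n, m):
    """|LHS - RHS| of (228) for odd n and even m."""
    _, lam, coef = pswf(c, max(m, n) + 1)
    t, w = leg.leggauss(80)                           # even order: no node at t = 0
    psi = leg.legval(t, coef)
    big_psi_1 = leg.legval(1.0, leg.legint(coef))     # Psi_j(1) = int_0^1 psi_j(x) dx
    denom = lam[m]**2 + lam[n]**2
    lhs = np.sum(w * psi[n] * psi[m] / t)
    a = np.sum(w * psi[n] / t)                        # int (1/t) psi_n(t) dt
    b = np.sum(w * psi[m])                            # int psi_m(t) dt
    rhs = (2j * c * lam[m] * lam[n] / denom * big_psi_1[n] * big_psi_1[m]
           + lam[m] / denom * a * b)
    return abs(lhs - rhs)
```

For instance, `residual_228(2.0, 1, 0)` compares the two sides for n = 1, m = 0 at the (arbitrarily chosen) band limit c = 2.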

The above theorems do not use much of the detailed structure of the integral operators
of which the functions $\{\psi_j\}$ are eigenfunctions. Thus many of them generalize easily to
the case of an operator $L : L^2[0,1] \to L^2[0,1]$ defined via the formula

$$
L(\psi)(x) = \int_0^1 K(xt)\,\psi(t)\,dt, \tag{229}
$$

for some function $K : [0,1] \to \mathbb{C}$; the following theorem is an example of this.

Theorem 9.9 Let $\lambda_1, \lambda_2$ be two eigenvalues of the operator L defined by (229), that is,

$$
\int_0^1 K(xt)\,\psi_1(t)\,dt = \lambda_1\,\psi_1(x), \tag{230}
$$

$$
\int_0^1 K(xt)\,\psi_2(t)\,dt = \lambda_2\,\psi_2(x). \tag{231}
$$

Then

$$
\frac{\lambda_2}{\lambda_1} = \frac{\int_0^1 x\,\psi_1'(x)\,\psi_2(x)\,dx}{\int_0^1 x\,\psi_2'(x)\,\psi_1(x)\,dx}, \tag{232}
$$

provided that neither $\lambda_1$ nor the denominator of the right hand side of (232) is zero.

Proof. Differentiating (230), (231) with respect to x, we get

$$
\int_0^1 t\,K'(xt)\,\psi_1(t)\,dt = \lambda_1\,\psi_1'(x), \tag{233}
$$

$$
\int_0^1 t\,K'(xt)\,\psi_2(t)\,dt = \lambda_2\,\psi_2'(x). \tag{234}
$$

Multiplying (233) by $x\,\psi_2(x)$, we have

$$
\lambda_1\,x\,\psi_1'(x)\,\psi_2(x) = x\,\psi_2(x) \int_0^1 t\,K'(xt)\,\psi_1(t)\,dt. \tag{235}
$$

Integrating on the interval [0, 1], we obtain

\begin{align*}
\lambda_1 \int_0^1 x\,\psi_1'(x)\,\psi_2(x)\,dx
 &= \int_0^1 x\,\psi_2(x) \int_0^1 t\,K'(xt)\,\psi_1(t)\,dt\,dx \tag{236} \\
 &= \int_0^1 t\,\psi_1(t) \int_0^1 x\,K'(xt)\,\psi_2(x)\,dx\,dt. \tag{237}
\end{align*}

Renaming the variables of integration on the right hand side from x to t and vice versa,
we get

$$
\lambda_1 \int_0^1 x\,\psi_1'(x)\,\psi_2(x)\,dx = \int_0^1 x\,\psi_1(x) \int_0^1 t\,K'(xt)\,\psi_2(t)\,dt\,dx. \tag{238}
$$

Substituting (234) into (238), we obtain

$$
\lambda_1 \int_0^1 x\,\psi_1'(x)\,\psi_2(x)\,dx = \lambda_2 \int_0^1 x\,\psi_1(x)\,\psi_2'(x)\,dx, \tag{239}
$$

from which (232) follows immediately, as does its caveat. □
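Because Theorem 9.9 only uses the form K(xt) of the kernel, it can be illustrated with any smooth kernel. The sketch below is not from the original text: it assumes NumPy, picks $K(u) = e^u$ purely as an example, discretizes (229) with a Gauss-Legendre (Nyström) rule on [0, 1], and compares the two sides of (239). The function names are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre as leg


def top_two_eigenpairs(kernel, quad=40):
    """Nystrom discretization of psi -> int_0^1 K(x t) psi(t) dt; two largest |eigenvalues|."""
    t, w = leg.leggauss(quad)
    t, w = 0.5 * (t + 1.0), 0.5 * w            # map nodes and weights from [-1, 1] to [0, 1]
    s = np.sqrt(w)
    # Symmetrized matrix sqrt(w_j) K(t_j t_k) sqrt(w_k); valid for a real kernel.
    vals, vecs = np.linalg.eigh(s[:, None] * kernel(np.outer(t, t)) * s[None, :])
    order = np.argsort(-np.abs(vals))[:2]
    return t, w, vals[order], vecs[:, order] / s[:, None]


def both_sides_239(kernel, kernel_prime):
    """Return (lambda_1 * int x psi_1' psi_2, lambda_2 * int x psi_1 psi_2')."""
    t, w, lam, psi = top_two_eigenpairs(kernel)
    # psi_i'(x) = (1 / lambda_i) * int_0^1 t K'(x t) psi_i(t) dt, as in (233)-(234).
    dpsi = (kernel_prime(np.outer(t, t)) * (w * t)[None, :]) @ psi / lam[None, :]
    lhs = lam[0] * np.sum(w * t * dpsi[:, 0] * psi[:, 1])
    rhs = lam[1] * np.sum(w * t * psi[:, 0] * dpsi[:, 1])
    return lhs, rhs
```

With `both_sides_239(np.exp, np.exp)` the two returned values should agree to round-off, since the discrete double sums on both sides coincide once the roles of the two quadrature indices are exchanged, mirroring the renaming step between (237) and (238).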


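The following theorem establishes the relation between the norm of each function $\psi_j$
on [-1, 1] (which in this paper is taken to be one), and its norm on $(-\infty, \infty)$.

Theorem 9.10 Suppose that c is real and positive, and that the integer n is non-negative.
Then

$$
\int_{-\infty}^{\infty} \psi_n^2(x)\,dx = \frac{1}{\mu_n}, \tag{240}
$$

where $\mu_n$ is given by (21).

Proof.

\begin{align*}
\int_{-\infty}^{\infty} \psi_n^2(x)\,dx
 &= \int_{-\infty}^{\infty} \psi_n(x)\,\Bigl(\frac{1}{\pi\mu_n} \int_{-1}^{1} \frac{\sin(c\,(x-t))}{x-t}\,\psi_n(t)\,dt\Bigr)\,dx \\
 &= \frac{1}{\mu_n} \int_{-1}^{1} \psi_n(t)\,\Bigl(\frac{1}{\pi} \int_{-\infty}^{\infty} \frac{\sin(c\,(x-t))}{x-t}\,\psi_n(x)\,dx\Bigr)\,dt \\
 &= \frac{1}{\mu_n} \int_{-1}^{1} \psi_n^2(t)\,dt \\
 &= \frac{1}{\mu_n}.
\end{align*}

□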


The following theorem extends Theorem 9.10 to any band-limited function with
band limit c.

Theorem 9.11 Suppose that c is real and positive, that the integer n is non-negative,
and that $f : \mathbb{R} \to \mathbb{C}$ is a band-limited function with band limit c. Then

$$
\int_{-\infty}^{\infty} \psi_n(x)\,f(x)\,dx = \frac{1}{\mu_n} \int_{-1}^{1} \psi_n(x)\,f(x)\,dx.
$$

Proof.

\begin{align*}
\int_{-\infty}^{\infty} \psi_n(x)\,f(x)\,dx
 &= \int_{-\infty}^{\infty} \Bigl(\frac{1}{\pi\mu_n} \int_{-1}^{1} \frac{\sin(c\,(x-t))}{x-t}\,\psi_n(t)\,dt\Bigr)\,f(x)\,dx \\
 &= \frac{1}{\mu_n} \int_{-1}^{1} \psi_n(t)\,\Bigl(\frac{1}{\pi} \int_{-\infty}^{\infty} \frac{\sin(c\,(x-t))}{x-t}\,f(x)\,dx\Bigr)\,dt \\
 &= \frac{1}{\mu_n} \int_{-1}^{1} \psi_n(t)\,f(t)\,dt. \tag{241}
\end{align*}

□


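Theorem 9.12 Suppose that c is real and positive, and that the integer m is non-negative.
Then

$$
\int_{-\infty}^{\infty} e^{icxt}\,\psi_m(t)\,dt =
\begin{cases}
\dfrac{\lambda_m}{\mu_m}\,\psi_m(x), & \text{if } -1 < x < 1, \\[1ex]
0, & \text{if } x > 1 \text{ or } x < -1.
\end{cases} \tag{242}
$$

Proof. Since $\psi_m$ is an eigenfunction of the operator $Q_c$ defined in (19), and $\mu_m$ is the
corresponding eigenvalue,

$$
\psi_m(x) = \frac{1}{\mu_m}\,\frac{1}{\pi} \int_{-1}^{1} \frac{\sin(c\,(x-u))}{x-u}\,\psi_m(u)\,du. \tag{243}
$$

Thus

\begin{align*}
\int_{-\infty}^{\infty} e^{icxt}\,\psi_m(t)\,dt
 &= \frac{1}{\mu_m} \int_{-\infty}^{\infty} e^{icxt}\,\Bigl(\frac{1}{\pi} \int_{-1}^{1} \frac{\sin(c\,(t-u))}{t-u}\,\psi_m(u)\,du\Bigr)\,dt \tag{244} \\
 &= \frac{1}{\mu_m} \int_{-1}^{1} \psi_m(u)\,\Bigl(\frac{1}{\pi} \int_{-\infty}^{\infty} \frac{\sin(c\,(t-u))}{t-u}\,e^{icxt}\,dt\Bigr)\,du. \tag{245}
\end{align*}

Since the innermost integral is the orthogonal projection operator onto the space of
functions of band limit c on $(-\infty, \infty)$, applied to the function $e^{icxt}$ (viewed as a
function of t), it follows that

$$
\int_{-\infty}^{\infty} e^{icxt}\,\psi_m(t)\,dt =
\begin{cases}
\dfrac{1}{\mu_m} \displaystyle\int_{-1}^{1} \psi_m(u)\,e^{icxu}\,du, & \text{if } -1 < x < 1, \\[1ex]
0, & \text{if } x > 1 \text{ or } x < -1,
\end{cases} \tag{246--247}
$$

from which (242) follows immediately. □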


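The following five theorems establish formulae for the derivatives of prolate functions
and their associated eigenvalues with respect to c.

Theorem 9.13 For all positive real c and non-negative integer m,

$$
\frac{\partial\lambda_m}{\partial c} = \lambda_m\,\frac{2\,\psi_m^2(1) - 1}{2c}. \tag{248}
$$

Proof. We start with

$$
\lambda_m\,\psi_m(x) = \int_{-1}^{1} e^{icxt}\,\psi_m(t)\,dt. \tag{249}
$$

Differentiating (249) with respect to c, we obtain

$$
\frac{\partial\lambda_m}{\partial c}\,\psi_m(x) + \lambda_m\,\frac{\partial\psi_m(x)}{\partial c}
 = \int_{-1}^{1} i\,x\,t\,e^{icxt}\,\psi_m(t)\,dt + \int_{-1}^{1} e^{icxt}\,\frac{\partial\psi_m(t)}{\partial c}\,dt. \tag{250}
$$

Multiplying by $\psi_m(x)$ on both sides of (250), and integrating on the interval [-1, 1], we
get

\begin{align*}
\frac{\partial\lambda_m}{\partial c} + \lambda_m \int_{-1}^{1} \frac{\partial\psi_m(x)}{\partial c}\,\psi_m(x)\,dx
 = {} & \int_{-1}^{1} \psi_m(x) \int_{-1}^{1} i\,x\,t\,e^{icxt}\,\psi_m(t)\,dt\,dx \\
 & + \int_{-1}^{1} \psi_m(x) \int_{-1}^{1} e^{icxt}\,\frac{\partial\psi_m(t)}{\partial c}\,dt\,dx, \tag{251}
\end{align*}

which we rewrite as

\begin{align*}
\frac{\partial\lambda_m}{\partial c} + \lambda_m \int_{-1}^{1} \frac{\partial\psi_m(x)}{\partial c}\,\psi_m(x)\,dx
 &= \int_{-1}^{1} i\,t\,\psi_m(t) \int_{-1}^{1} e^{icxt}\,x\,\psi_m(x)\,dx\,dt
  + \int_{-1}^{1} \frac{\partial\psi_m(t)}{\partial c} \int_{-1}^{1} e^{icxt}\,\psi_m(x)\,dx\,dt \tag{252} \\
 &= \lambda_m \int_{-1}^{1} i\,t\,\psi_m(t)\,\frac{1}{ic}\,\frac{d\psi_m(t)}{dt}\,dt
  + \lambda_m \int_{-1}^{1} \frac{\partial\psi_m(t)}{\partial c}\,\psi_m(t)\,dt, \tag{253}
\end{align*}

which we summarize as

$$
\frac{\partial\lambda_m}{\partial c} = \frac{\lambda_m}{c} \int_{-1}^{1} t\,\psi_m(t)\,\frac{d\psi_m(t)}{dt}\,dt. \tag{254}
$$

On the other hand, integrating the right-hand side of (254) by parts, we have

$$
\int_{-1}^{1} t\,\psi_m(t)\,\frac{d\psi_m(t)}{dt}\,dt
 = \psi_m^2(1) + \psi_m^2(-1) - 1 - \int_{-1}^{1} \psi_m(t)\,t\,\frac{d\psi_m(t)}{dt}\,dt, \tag{255}
$$

which we rewrite as

$$
\int_{-1}^{1} t\,\psi_m(t)\,\frac{d\psi_m(t)}{dt}\,dt = \psi_m^2(1) - \frac{1}{2}. \tag{256}
$$

Finally, substituting (256) into (254), we get

$$
\frac{\partial\lambda_m}{\partial c} = \lambda_m\,\frac{2\,\psi_m^2(1) - 1}{2c}. \tag{257}
$$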
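Formula (248) can be compared against a finite-difference derivative of $\lambda_m$. The sketch below is an illustration only, not the authors' procedure: it assumes NumPy, uses hypothetical helper names, approximates $\psi_m$ by a truncated Legendre expansion, and obtains $\lambda_m$ from (249) evaluated at x = 1, so the sign ambiguity of the eigenvector does not affect the result.

```python
import numpy as np
from numpy.polynomial import legendre as leg


def pswf(c, count, terms=40, quad=80):
    """Approximate chi_j, lambda_j and Legendre coefficients of psi_j for j < count."""
    k = np.arange(terms, dtype=float)
    diag = k * (k + 1) + (2 * k * (k + 1) - 1) / ((2 * k + 3) * (2 * k - 1)) * c**2
    j = k[:-2]
    off = (j + 2) * (j + 1) / ((2 * j + 3) * np.sqrt((2 * j + 1) * (2 * j + 5))) * c**2
    chi, beta = np.linalg.eigh(np.diag(diag) + np.diag(off, 2) + np.diag(off, -2))
    coef = beta[:, :count] * np.sqrt(k + 0.5)[:, None]
    x, w = leg.leggauss(quad)
    # (249) at x = 1: lambda_j psi_j(1) = int exp(i c t) psi_j(t) dt.
    lam = (leg.legval(x, coef) * np.exp(1j * c * x)) @ w / leg.legval(1.0, coef)
    return chi[:count], lam, coef


def dlam_dc_finite_difference(c, m, h=1e-4):
    """Central-difference estimate of d(lambda_m)/dc."""
    up = pswf(c + h, m + 1)[1][m]
    down = pswf(c - h, m + 1)[1][m]
    return (up - down) / (2 * h)


def dlam_dc_formula(c, m):
    """Right-hand side of (248)."""
    _, lam, coef = pswf(c, m + 1)
    psi_1 = leg.legval(1.0, coef)[m]
    return lam[m] * (2 * psi_1**2 - 1) / (2 * c)
```

For example, at c = 5 the two functions should agree for m = 0, 1, 2, 3 up to the accuracy of the central difference (step h is an arbitrary choice).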


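Theorem 9.14 For any positive real c and non-negative integer m,

$$
\frac{\partial\mu_m}{\partial c} = \frac{2}{c}\,\mu_m\,\psi_m^2(1). \tag{258}
$$

Proof. We start with the identity

$$
\mu_m = \frac{c}{2\pi}\,\lambda_m\,\bar{\lambda}_m. \tag{259}
$$

Differentiating (259) with respect to c, we get

$$
\frac{\partial\mu_m}{\partial c}
 = \frac{c}{2\pi}\,\Bigl(\lambda_m\,\frac{\partial\bar{\lambda}_m}{\partial c} + \bar{\lambda}_m\,\frac{\partial\lambda_m}{\partial c}\Bigr)
 + \frac{1}{2\pi}\,\lambda_m\,\bar{\lambda}_m. \tag{260}
$$

Substituting Theorem 9.13 into (260), we get

\begin{align*}
\frac{\partial\mu_m}{\partial c}
 &= \frac{c}{2\pi}\cdot 2\,\lambda_m\,\bar{\lambda}_m\,\frac{2\,\psi_m^2(1) - 1}{2c} + \frac{1}{2\pi}\,\lambda_m\,\bar{\lambda}_m \\
 &= 2\,\mu_m\,\frac{2\,\psi_m^2(1) - 1}{2c} + \frac{1}{c}\,\mu_m \tag{261} \\
 &= \frac{2}{c}\,\mu_m\,\psi_m^2(1) - \frac{1}{c}\,\mu_m + \frac{1}{c}\,\mu_m. \tag{262}
\end{align*}

□

The following theorem immediately follows from Theorems 9.13 and 9.14.

Theorem 9.15 For all positive real c and non-negative integer m, n,

(263)

(264)

Theorem 9.16 Suppose that c is real and positive, and the integers m, n are non-negative.
If $m \neq n$, then

$$
\int_{-1}^{1} \psi_m(t)\,\frac{\partial\psi_n}{\partial c}(t)\,dt
 = -\frac{2}{c}\,\frac{\lambda_m\lambda_n}{\lambda_m^2-\lambda_n^2}\,\psi_m(1)\,\psi_n(1). \tag{265}
$$

If $m = n$, then

$$
\int_{-1}^{1} \psi_m(t)\,\frac{\partial\psi_m}{\partial c}(t)\,dt = 0. \tag{266}
$$

Proof. Since the norm of $\psi_n$ on [-1, 1] remains constant as c varies, $\psi_n$ must be
orthogonal on [-1, 1] to its own derivative with respect to c, which immediately yields (266).
To establish (265), we start with the identity

$$
\lambda_n\,\psi_n(x) = \int_{-1}^{1} e^{icxt}\,\psi_n(t)\,dt. \tag{267}
$$

Differentiating (267) with respect to c, we get

$$
\frac{\partial\lambda_n}{\partial c}\,\psi_n(x) + \lambda_n\,\frac{\partial\psi_n(x)}{\partial c}
 = \int_{-1}^{1} \Bigl[\,i\,x\,t\,e^{icxt}\,\psi_n(t) + e^{icxt}\,\frac{\partial\psi_n(t)}{\partial c}\Bigr]\,dt. \tag{268}
$$

Multiplying both sides of (268) by $\psi_m(x)$ and integrating with respect to x (for $m \neq n$
the term containing $\partial\lambda_n/\partial c$ vanishes by orthogonality), we have

$$
\lambda_n \int_{-1}^{1} \psi_m(x)\,\frac{\partial\psi_n(x)}{\partial c}\,dx
 = \frac{\lambda_m}{c} \int_{-1}^{1} x\,\psi_m'(x)\,\psi_n(x)\,dx
 + \lambda_m \int_{-1}^{1} \psi_m(t)\,\frac{\partial\psi_n(t)}{\partial c}\,dt, \tag{269}
$$

which, using (176), we rewrite as

$$
(\lambda_n - \lambda_m) \int_{-1}^{1} \psi_m(t)\,\frac{\partial\psi_n(t)}{\partial c}\,dt
 = \frac{1}{c}\,\frac{\lambda_m\lambda_n}{\lambda_m+\lambda_n}\,\bigl(2\,\psi_m(1)\,\psi_n(1) - \delta_{mn}\bigr). \tag{270}
$$

Assuming that $m \neq n$, and dividing by $\lambda_n - \lambda_m$, we then get (265). □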


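Theorem 9.17 Suppose that c is real and positive, and the integer m is non-negative.
Then

$$
\frac{\partial\chi_m}{\partial c} = 2c \int_{-1}^{1} x^2\,\psi_m^2(x)\,dx. \tag{271}
$$

Proof. Due to Theorem 2.6,

$$
(1 - x^2)\,\psi_m''(x) - 2x\,\psi_m'(x) + (\chi_m - c^2 x^2)\,\psi_m(x) = 0. \tag{272}
$$

Making the infinitesimal changes $c \to c + h$, $\chi_m \to \chi_m + \varepsilon$, and
$\psi_m(x) \to \psi_m(x) + \delta(x)$, this becomes

$$
(1 - x^2)\,\bigl(\psi_m''(x) + \delta''(x)\bigr) - 2x\,\bigl(\psi_m'(x) + \delta'(x)\bigr)
 + \bigl(\chi_m + \varepsilon - (c + h)^2 x^2\bigr)\,\bigl(\psi_m(x) + \delta(x)\bigr) = 0. \tag{273}
$$

Expanding each term, discarding infinitesimals of the second order or greater (that is,
products of two or more of the quantities h, $\varepsilon$, and $\delta(x)$), and subtracting (272), we get

$$
(1 - x^2)\,\delta''(x) - 2x\,\delta'(x) + (\chi_m - c^2 x^2)\,\delta(x) + (\varepsilon - 2chx^2)\,\psi_m(x) = 0. \tag{274}
$$

Let the self-adjoint differential operator L be defined by the formula

$$
L(f)(x) = (1 - x^2)\,f''(x) - 2x\,f'(x) + (\chi_m - c^2 x^2)\,f(x). \tag{275}
$$

Then, multiplying (274) by $\psi_m(x)/h$ and integrating on [-1, 1], we get

$$
\frac{1}{h} \int_{-1}^{1} \psi_m(x)\,L(\delta)(x)\,dx + \frac{\varepsilon}{h} - \int_{-1}^{1} 2c\,x^2\,\psi_m^2(x)\,dx = 0. \tag{276}
$$

Now $\varepsilon/h = \partial\chi_m/\partial c$. In addition, since L is self-adjoint,

$$
\int_{-1}^{1} \psi_m(x)\,L(\delta)(x)\,dx = \int_{-1}^{1} L(\psi_m)(x)\,\delta(x)\,dx. \tag{277}
$$

But due to (272), $L(\psi_m)(x) = 0$ for all $x \in [-1, 1]$, so the integral (277) is zero.
Thus (276) becomes

$$
\frac{\partial\chi_m}{\partial c} = 2c \int_{-1}^{1} x^2\,\psi_m^2(x)\,dx. \tag{278}
$$

□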
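The perturbation argument above can be mirrored numerically by comparing a central difference of $\chi_m(c)$ with the right-hand side of (278). This is a minimal sketch under stated assumptions, not part of the original text: NumPy is assumed, the function names are hypothetical, and $\chi_m$, $\psi_m$ are taken from a truncated matrix of the operator in (272) in the normalized Legendre basis.

```python
import numpy as np
from numpy.polynomial import legendre as leg


def chi_and_coef(c, count, terms=40):
    """Eigenvalues chi_j and Legendre coefficients of psi_j for j < count."""
    k = np.arange(terms, dtype=float)
    # Matrix of -d/dx((1-x^2) d/dx) + c^2 x^2 in the normalized Legendre basis.
    diag = k * (k + 1) + (2 * k * (k + 1) - 1) / ((2 * k + 3) * (2 * k - 1)) * c**2
    j = k[:-2]
    off = (j + 2) * (j + 1) / ((2 * j + 3) * np.sqrt((2 * j + 1) * (2 * j + 5))) * c**2
    chi, beta = np.linalg.eigh(np.diag(diag) + np.diag(off, 2) + np.diag(off, -2))
    return chi[:count], beta[:, :count] * np.sqrt(k + 0.5)[:, None]


def dchi_dc_finite_difference(c, m, h=1e-4):
    """Central-difference estimate of d(chi_m)/dc."""
    return (chi_and_coef(c + h, m + 1)[0][m] - chi_and_coef(c - h, m + 1)[0][m]) / (2 * h)


def dchi_dc_formula(c, m):
    """Right-hand side of (278): 2 c * int_{-1}^{1} x^2 psi_m(x)^2 dx."""
    _, coef = chi_and_coef(c, m + 1)
    x, w = leg.leggauss(80)
    psi_m = leg.legval(x, coef[:, m])
    return 2 * c * np.sum(w * x**2 * psi_m**2)
```

At c = 5, for example, `dchi_dc_finite_difference(5.0, m)` and `dchi_dc_formula(5.0, m)` should agree closely for small m; the step h and the truncation size are arbitrary illustrative choices.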


10  Generalizations  and  Conclusions 

In this paper, we design quadrature rules for band-limited functions, based on the
properties of Prolate Spheroidal Wave Functions (PSWFs), and the connections of the latter
with certain fundamental integral operators (see (17), (19) in Section 2.5). The
quadratures are a surprisingly close analogue for band-limited functions of Gaussian quadratures
for polynomials, in that they have positive weights, are optimal in the appropriately
defined sense, and their nodes, when used for approximation (as opposed to integration),
result  in  extremely  efficient  interpolation  formulae.  Thus,  Sections  5-7  of  this  paper  can 
be  viewed  as  reproducing  for  band-limited  functions  much  of  the  standard  polynomial- 
based  approximation  theory  (for  which  see,  for  example,  [24]).  Generally,  there  is  a 
striking  analogy  between  the  band-limited  functions  and  polynomials. 

Obviously, there are certain differences between the resulting apparatus and the
standard numerical analysis. To start with, where the classical techniques are optimal for
polynomials, the approach of this paper is optimal for band-limited functions; whenever
the functions to be dealt with are naturally represented by trigonometric expansions on
finite intervals, our quadrature and interpolation formulae tend to be more efficient than
those based on the polynomials. When the functions to be dealt with are naturally
represented by polynomials, the classical approach is more efficient; however, many physical
phenomena involve band-limited functions, and very few involve polynomials.

Qualitatively,  the  quadrature  (and  interpolation)  nodes  obtained  in  this  paper  behave 
like  a  compromise  between  the  Gaussian  nodes  and  the  equispaced  ones:  near  the  middle 
of  the  interval,  they  are  very  nearly  equispaced,  and  near  the  ends,  they  concentrate 
somewhat, but much less than the Gaussian (or Chebyshev) nodes do. For large c, the
distance  between  nodes  near  the  ends  of  the  interval  is  of  the  order  with  the  total 
number  of  nodes  close  to  f .  In  contrast,  the  distance  between  the  Gaussian  nodes  near 
the  ends  of  the  interval  is  of  the  order  with  n  the  total  number  of  nodes.  A  closely 
related  phenomenon  is  the  reduced  norm  of  the  differentiation  operator  based  on  the 
prolate expansions: for an n-point differentiation formula, the norm is of the order $n^{3/2}$,
as opposed to $n^2$ for polynomial-based spectral differentiation. Thus, PSWFs are likely
to  be  a  better  tool  for  the  design  of  spectral  and  pseudo-spectral  techniques  than  the 
orthogonal  polynomials  and  related  functions. 

Much  of  the  analytical  apparatus  we  use  was  developed  more  than  30  years  ago 
(see  [20]-[21],  [17],  [18]);  the  fundamental  importance  of  these  results  in  certain  areas  of 
electrical  engineering  and  physics  has  also  been  understood  for  a  long  time.  However, 
there  appears  to  have  been  no  prior  attempt  made  to  view  band-limited  functions  as  a 
source  of  numerical  algorithms.  Generally,  there  is  a  fairly  limited  amount  of  information 
in  the  literature  about  the  PSWFs,  especially  when  compared  to  the  wealth  of  facts  on 
many other special functions. Section 9 of this paper is an attempt to remedy this
situation  to  a  small  degree. 

The apparatus built in this paper is a strictly one-dimensional one. Obviously, one
can construct discretizations of rectangles, cubes, etc. by using direct products of
one-dimensional grids; the resulting numerical algorithms are satisfactory but not optimal.
Furthermore, representation of band-limited functions on regions in higher dimensions
is of both theoretical and engineering interest. Obvious applications include seismic
data collection and processing, antenna theory, NMR imaging, and many others. When
the region of interest is a sphere, most of the necessary analytical apparatus can be
found in [21]. At the present time, we have constructed and implemented somewhat
rudimentary versions of the relevant numerical algorithms; we are conducting numerical
experiments with these, and will report the results at a later date. A much more difficult
set of questions is presented by the structure of band-limited functions on more general
regions.
Table 1: Quadrature performance for varying band limits, for $\varepsilon = 10^{-7}$

       c       n     Maximum Errors          N_pol
                     Roots       Refined
    10.0       9     0.96E-05    0.51E-07       13
    20.0      13     0.17E-04    0.94E-07       19
    30.0      17     0.12E-04    0.50E-07       25
    40.0      20     0.70E-05    0.30E-06       31
    50.0      24     0.35E-05    0.83E-07       37
    60.0      27     0.25E-04    0.27E-06       43
    70.0      31     0.11E-04    0.66E-07       48
    80.0      34     0.48E-05    0.17E-06       54
    90.0      38     0.21E-05    0.40E-07       59
   100.0      41     0.12E-04    0.91E-07       65
   200.0      74     0.24E-05    0.86E-07      118
   300.0     106     0.32E-05    0.21E-06      171
   400.0     139     0.52E-05    0.62E-07      223
   500.0     171     0.56E-05    0.88E-07      275
   600.0     203     0.58E-05    0.11E-06      326
   700.0     235     0.57E-05    0.12E-06      377
   800.0     267     0.55E-05    0.13E-06      428
   900.0     299     0.53E-05    0.14E-06      479
  1000.0     331     0.50E-05    0.14E-06      530
  1200.0     395     0.44E-05    0.13E-06      632
  1400.0     459     0.38E-05    0.11E-06      734
  1600.0     523     0.31E-05    0.97E-07      835
  1800.0     587     0.28E-05    0.80E-07      937
  2000.0     651     0.23E-05    0.64E-07     1038
  2400.0     778     0.29E-05    0.15E-06     1240
  2800.0     906     0.19E-05    0.84E-07     1442
  4000.0    1288     0.37E-05    0.17E-06     2047


Table 2: Quadrature performance for varying precisions, for c = 50

  epsilon      n     Maximum Errors          N_pol
                     Roots       Refined
  0.10E-01    19     0.45E-01    0.10E-01       30
  0.10E-02    20     0.70E-02    0.13E-02       32
  0.10E-03    21     0.91E-03    0.14E-03       33
  0.10E-04    22     0.82E-04    0.13E-04       34
  0.10E-05    23     0.54E-04    0.11E-05       36
  0.10E-06    24     0.35E-05    0.83E-07       37
  0.10E-07    25     0.33E-05    0.57E-08       38
  0.10E-08    26     0.18E-06    0.36E-09       39
  0.10E-09    26     0.18E-06    0.36E-09       40
  0.10E-10    27     0.17E-06    0.21E-10       42
  0.10E-11    28     0.79E-08    0.11E-11       43
  0.10E-12    29     0.78E-08    0.56E-13       45
  0.10E-13    30     0.31E-09    0.27E-14       55


Table 3: Interpolation performance for varying band limits, for $\varepsilon = 10^{-7}$

       c       n     Maximum Errors          N_pol
                     Roots       Refined     Cheb.    Leg.
     5.0      13     0.12E-06    0.12E-06       17      17
    10.0      18     0.12E-06    0.13E-06       24      25
    15.0      22     0.24E-06    0.25E-06       31      32
    20.0      26     0.26E-06    0.28E-06       37      39
    25.0      30     0.22E-06    0.23E-06       43      45
    30.0      33     0.67E-06    0.73E-06       49      51
    35.0      37     0.42E-06    0.46E-06       55      57
    40.0      41     0.25E-06    0.27E-06       61      63
    45.0      44     0.54E-06    0.60E-06       67      69
    50.0      48     0.29E-06    0.33E-06       73      75
   100.0      82     0.39E-06    0.46E-06      128     131
   150.0     115     0.52E-06    0.64E-06      182     186
   200.0     147     0.12E-05    0.15E-05      235     239
   250.0     180     0.83E-06    0.11E-05      287     292
   300.0     212     0.13E-05    0.17E-05      340     345
   350.0     245     0.75E-06    0.10E-05      392     398
   400.0     277     0.10E-05    0.14E-05      443     450
   450.0     309     0.13E-05    0.18E-05      495     502
   500.0     341     0.16E-05    0.22E-05      547     554
  1000.0     662     0.16E-05    0.24E-05     1058    1068
  1500.0     982     0.15E-05    0.25E-05     1566    1578
  2000.0    1301     0.20E-05    0.35E-05     2072    2086


Table 4: Interpolation performance for varying precisions, for c = 25

  epsilon      n     Maximum Errors          N_pol
                     Roots       Refined     Cheb.    Leg.
  0.10E-01    21     0.38E-01    0.43E-01       31      34
  0.10E-02    23     0.37E-02    0.41E-02       34      36
  0.10E-03    25     0.29E-03    0.31E-03       37      39
  0.10E-04    26     0.74E-04    0.81E-04       39      41
  0.10E-05    28     0.44E-05    0.47E-05       41      43
  0.10E-06    30     0.22E-06    0.23E-06       43      45
  0.10E-07    31     0.46E-07    0.49E-07       45      47
  0.10E-08    32     0.95E-08    0.10E-07       47      49
  0.10E-09    34     0.36E-09    0.38E-09       49      51
  0.10E-10    35     0.67E-10    0.70E-10       51      52
  0.10E-11    37     0.21E-11    0.22E-11       53      54
  0.10E-12    38     0.36E-12    0.37E-12       54      56
  0.10E-13    39     0.59E-13    0.63E-13       98      61

Table 5: Quadrature nodes for band-limited functions, with c = 50 and $\varepsilon = 10^{-7}$

This table contains only half of the nodes and weights, in particular those for which the
node is less than or equal to zero; reflecting these nodes around zero yields the remaining
nodes, the weight for the node at -x being the same as the weight for the node at x.

  Node                        Weight
  -.9904522459960804E+00      0.2413064234922188E-01
  -.9525601106643832E+00      0.5024347217095568E-01
  -.8927960861459153E+00      0.6801787677830858E-01
  -.8186117530609125E+00      0.7952155999100788E-01
  -.7350624131965875E+00      0.8706680708376023E-01
  -.6452878027260844E+00      0.9216240765763570E-01
  -.5512554698695428E+00      0.9569254015486106E-01
  -.4542505281525226E+00      0.9817257766311556E-01
  -.3551568458127944E+00      0.9990914516102242E-01
  -.2546173463813596E+00      0.1010880172648715E+00
  -.1531287781860989E+00      0.1018214308931439E+00
  -.5110121484050418E-01      0.1021735189986602E+00


Table 6: Quadrature nodes for band-limited functions, with c = 150 and $\varepsilon = 10^{-14}$

This table contains only half of the nodes and weights, in particular those for which the
node is less than or equal to zero; reflecting these nodes around zero yields the remaining
nodes, the weight for the node at -x being the same as the weight for the node at x.

  Node                        Weight
  -.9982883010959975E+00      0.4374483371752129E-02
  -.9911354691596528E+00      0.9842619236149078E-02
  -.9788315280982487E+00      0.1463518300250369E-01
  -.9621348937901911E+00      0.1862396111287527E-01
  -.9418386698454396E+00      0.2184988739217138E-01
  -.9186509576802944E+00      0.2442858670932862E-01
  -.8931541850293142E+00      0.2648864579258096E-01
  -.8658083894041821E+00      0.2814375940413615E-01
  -.8369709588254746E+00      0.2948528624795690E-01
  -.8069187108185302E+00      0.3058356160435090E-01
  -.7758670331396409E+00      0.3149181066633766E-01
  -.7439849501152674E+00      0.3225015506203403E-01
  -.7114064976175457E+00      0.3288893713079314E-01
  -.6782391686910609E+00      0.3343126421620424E-01
  -.6445701594098660E+00      0.3389488931551181E-01
  -.6104710013384929E+00      0.3429358206877410E-01
  -.5760010202980960E+00      0.3463812513892117E-01
  -.5412099413257457E+00      0.3493704033879884E-01
  -.5061398697742787E+00      0.3519712095895683E-01
  -.4708268134473433E+00      0.3542382499917732E-01
  -.4353018643598344E+00      0.3562156808557525E-01
  -.3995921259242572E+00      0.3579394352776868E-01
  -.3637214481257228E+00      0.3594388900778062E-01
  -.3277110167114320E+00      0.3607381381247460E-01
  -.2915798305819667E+00      0.3618569660385742E-01
  -.2553450930388687E+00      0.3628116095737887E-01
  -.2190225363501577E+00      0.3636153393399723E-01
  -.1826266945721476E+00      0.3642789154364812E-01
  -.1461711362450572E+00      0.3648109393796617E-01
  -.1096686661347072E+00      0.3652181242257066E-01
  -.7313150339365902E-01      0.3655054982303338E-01
  -.3657144220122915E-01      0.3656765531685031E-01
  0                           0.3657333451556860E-01
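The tabulated rules are used like any other weighted-sum quadrature. As a hedged usage sketch (not from the original text; NumPy is assumed and the function names are hypothetical), the code below mirrors the half-table for c = 50 from Table 5 into the full 24-point rule and applies it to $\cos(ax)$ with $|a| \le c$, whose exact integral over [-1, 1] is $2\sin(a)/a$.

```python
import numpy as np

# Nodes <= 0 and their weights, transcribed from Table 5 (c = 50).
HALF_NODES = np.array([
    -0.9904522459960804, -0.9525601106643832, -0.8927960861459153,
    -0.8186117530609125, -0.7350624131965875, -0.6452878027260844,
    -0.5512554698695428, -0.4542505281525226, -0.3551568458127944,
    -0.2546173463813596, -0.1531287781860989, -0.05110121484050418,
])
HALF_WEIGHTS = np.array([
    0.02413064234922188, 0.05024347217095568, 0.06801787677830858,
    0.07952155999100788, 0.08706680708376023, 0.09216240765763570,
    0.09569254015486106, 0.09817257766311556, 0.09990914516102242,
    0.1010880172648715, 0.1018214308931439, 0.1021735189986602,
])


def full_rule():
    """Reflect the tabulated half around zero (this table has no node at zero)."""
    nodes = np.concatenate([HALF_NODES, -HALF_NODES[::-1]])
    weights = np.concatenate([HALF_WEIGHTS, HALF_WEIGHTS[::-1]])
    return nodes, weights


def integrate_cos(a):
    """Approximate int_{-1}^{1} cos(a x) dx with the 24-point rule."""
    x, w = full_rule()
    return float(np.dot(w, np.cos(a * x)))


def exact_cos_integral(a):
    """Exact value 2 sin(a) / a (equal to 2 when a = 0)."""
    return 2.0 if a == 0 else 2.0 * np.sin(a) / a
```

For frequencies a between 0 and 50, `integrate_cos(a)` and `exact_cos_integral(a)` should differ only by an amount comparable to the precision the table was designed for. For Table 6 the same mirroring applies, except that the node at zero must not be duplicated.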